Section 28.7: Using incorrect format specifier in printf


Using an incorrect format specifier in the first argument to printf invokes undefined behavior. For example, the 
code below invokes undefined behavior: 


long z = 'B';
printf("%c\n", z);


Here is another example:


printf("%f\n", 0);


The line above has undefined behavior: %f expects a double, but 0 is of type int.


Note that your compiler usually can help you avoid cases like these, if you turn on the proper flags during compiling 
(-Wformat in clang and gcc). From the last example: 


warning: format specifies type ‘double’ but the argument has type ‘int’ [-Wformat]
    printf("%f\n",0);
            ~~    ^
            %d
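
The mismatch goes away once each conversion specifier agrees with the type of its argument. The following is a minimal sketch; the helper name format_demo and the use of snprintf to capture the output are illustrative assumptions, not part of the examples above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats three values, each with a matching specifier. */
int format_demo(char *buf, size_t size)
{
    long z = 'B';         /* a long needs %ld */
    double d = 0;         /* the int constant 0 is converted to double here */
    size_t n = sizeof z;  /* a size_t needs %zu (C99 and later) */

    return snprintf(buf, size, "%ld %f %zu", z, d, n);
}
```

Compiling with -Wformat (or -Wall) keeps checking that later changes to the variable types stay in step with the format string.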


Section 28.8: Modify string literal 


In this code example, the char pointer p is initialized to the address of a string literal. Attempting to modify the 
string literal has undefined behavior. 


char *p = "hello world";
p[0] = 'H'; // Undefined behavior


However, modifying a mutable array of char directly, or through a pointer is naturally not undefined behavior, even 
if its initializer is a literal string. The following is fine: 


char a[] = "hello, world";
char *p = a;

a[0] = 'H';
p[7] = 'W';


That's because the string literal is effectively copied to the array each time the array is initialized (once for variables 
with static duration, each time the array is created for variables with automatic or thread duration — variables with 
allocated duration aren't initialized), and it is fine to modify array contents. 


Section 28.9: Passing a null pointer to printf %s conversion 


The %s conversion of printf requires that the corresponding argument be a pointer to the initial element of an array of
character type. A null pointer does not point to the initial element of any array of character type, and thus the
behavior of the following is undefined:


char *foo = NULL;
printf("%s", foo); /* undefined behavior */


However, undefined behavior does not always mean that the program crashes; some systems take steps to
avoid the crash that normally happens when a null pointer is dereferenced. For example, glibc is known to print


(null)


for the code above. However, add (just) a newline to the format string and you will get a crash:


char *foo = 0;
printf("%s\n", foo); /* undefined behavior */


In this case, it happens because GCC has an optimization that turns printf("%s\n", argument); into a call to puts
with puts(argument), and puts in glibc does not handle null pointers. All this behavior is standard conforming.


Note that a null pointer is different from an empty string. So, the following is valid and has no undefined behaviour. It'll
just print a newline:


char *foo = "";
printf("%s\n", foo);
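
When a pointer may legitimately be null, one way to stay within defined behavior is to substitute a real string before it reaches printf. This is a minimal sketch; the helper names printable and print_line, and the choice of "(null)" as placeholder text, are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns s itself, or a placeholder string if s is null. */
const char *printable(const char *s)
{
    return s ? s : "(null)";
}

/* Prints one line; the argument handed to %s is never a null pointer. */
void print_line(const char *s)
{
    printf("%s\n", printable(s));
}
```

Because the check happens in the program itself, the result no longer depends on how a particular C library or compiler optimization treats a null argument.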


Section 28.10: Modifying any object more than once between 
two sequence points 


int i = 42;
i = i++; /* Assignment changes variable, post-increment as well */
int a = i++ + i--;


Code like this often leads to speculations about the "resulting value" of i. Rather than specifying an outcome, 
however, the C standards specify that evaluating such an expression produces undefined behavior. Prior to C2011, 
the standard formalized these rules in terms of so-called sequence points: 


Between the previous and next sequence point a scalar object shall have its stored value modified at 
most once by the evaluation of an expression. Furthermore, the prior value shall be read only to 
determine the value to be stored. 


(C99 standard, section 6.5, paragraph 2) 


That scheme proved to be a little too coarse, resulting in some expressions exhibiting undefined behavior with 
respect to C99 that plausibly should not do. C2011 retains sequence points, but introduces a more nuanced 
approach to this area based on sequencing and a relationship it calls "sequenced before": 


If a side effect on a scalar object is unsequenced relative to either a different side effect on the same 
scalar object or a value computation using the value of the same scalar object, the behavior is undefined. 
If there are multiple allowable orderings of the subexpressions of an expression, the behavior is 
undefined if such an unsequenced side effect occurs in any of the orderings. 


(C2011 standard, section 6.5, paragraph 2) 


The full details of the "sequenced before" relation are too long to describe here, but they supplement sequence 
points rather than supplanting them, so they have the effect of defining behavior for some evaluations whose 
behavior previously was undefined. In particular, if there is a sequence point between two evaluations, then the
one before the sequence point is "sequenced before" the one after. 


The following example has well-defined behaviour: 


int i = 42; 
i = (i++, i+42); /* The comma-operator creates a sequence point */ 


The following example has undefined behaviour: 


int i = 42; 
printf("%d %d\n", i++, i++); /* commas as separator of function arguments are not comma-operators */ 


As with any form of undefined behavior, observing the actual behavior of evaluating expressions that violate the
sequencing rules is not informative, except in a retrospective sense. The language standard provides no basis for
expecting such observations to be predictive even of the future behavior of the same program.
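
A common way to avoid unsequenced modifications is to give each side effect its own statement, because the end of a full expression is a sequence point. The sketch below uses two hypothetical functions, increment_once and sum_then_step, and assumes an intended meaning for the undefined expressions; the original code does not say what was meant.

```c
/* One modification per statement: the order of side effects is fully determined. */

int increment_once(int i)
{
    i++;                 /* a plausible intent behind "i = i++;" */
    return i;
}

int sum_then_step(int *i)
{
    int first = (*i)++;  /* the full expression ends here, so this is sequenced */
    int second = (*i)++; /* ... before this second increment */
    return first + second;
}
```

Starting from 42, sum_then_step adds 42 and 43 and leaves the variable at 44, an outcome that every conforming compiler must produce.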


Section 28.11: Freeing memory twice 


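Freeing memory twice is undefined behavior, e.g.


int *x = malloc(sizeof(int));
*x = 9;
free(x);
free(x); /* undefined behavior: x has already been freed */


Quote from the standard (C99, 7.20.3.2, The free function):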


Otherwise, if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc 
function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined. 
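
A frequently used precaution is to reset a pointer immediately after freeing it, since free(NULL) is defined to do nothing. The following is a minimal sketch; free_and_clear is a hypothetical helper name, and it only protects the one pointer variable passed to it, not other copies of the same address.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *p and sets it to NULL so that a repeated call is harmless. */
void free_and_clear(int **p)
{
    free(*p);   /* free(NULL) performs no action */
    *p = NULL;
}
```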


Section 28.12: Bit shifting using negative counts or beyond the 
width of the type 


If the shift count value is a negative value then both left shift and right shift operations are undefined (1):


int x = 5 << -3; /* undefined */ 
int x = 5 >> -3; /* undefined */ 


If left shift is performed on a negative value, it's undefined: 
int x = -5 << 3; /* undefined */ 


If left shift is performed on a positive value and the mathematical result is not representable in the type,
it's undefined (1):


/* Assuming an int is 32 bits wide, the value '5 * 2^72' doesn't fit
 * in an int (and the shift count 72 also exceeds the width of int).
 * So, this is undefined. */
int x = 5 << 72;


Note that right shift on a negative value (e.g. -5 >> 3) is not undefined but implementation-defined.


(1) Quoting ISO/IEC 9899:201x, section 6.5.7:


If the value of the right operand is negative or is greater than or equal to the width of the promoted left 
operand, the behavior is undefined. 
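
The conditions above can be tested before shifting. The following is a minimal sketch of such a check for left shifts of int; the helper name checked_shl is hypothetical, and the width calculation assumes that int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores value << count in *out and returns true
 * only when the left shift is well defined for int. */
bool checked_shl(int value, int count, int *out)
{
    /* Assumption: int has no padding bits, so this is its width. */
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (value < 0 || count < 0 || count >= width)
        return false;
    if (value > (INT_MAX >> count))   /* result would not be representable */
        return false;

    *out = value << count;
    return true;
}
```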


Section 28.13: Returning from a function that's declared with 
_Noreturn or noreturn function specifier 


Version ≥ C11


The function specifier _Noreturn was introduced in C11. The header <stdnoreturn.h> provides a macro noreturn 
which expands to _Noreturn. So using _Noreturn or noreturn from <stdnoreturn.h> is fine and equivalent. 


A function that's declared with _Noreturn (or noreturn) is not allowed to return to its caller. If such a function does 
return to its caller, the behavior is undefined. 


In the following example, func() is declared with the noreturn specifier but returns to its caller.


#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void func(void);
void func(void)
{
    printf("In func()...\n");
} /* Undefined behavior as func() returns */

int main(void)
{
    func();
    return 0;
}

gcc and clang produce warnings for the above program: 


$ gcc test.c
test.c: In function ‘func’:
test.c:9:1: warning: ‘noreturn’ function does return
 }
 ^


$ clang test.c
test.c:9:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^


An example using noreturn that has well-defined behavior:


#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void my_exit(void);

/* calls exit() and doesn't return to its caller. */
void my_exit(void)
{
    printf("Exiting...\n");
    exit(0);
}

int main(void)
{
    my_exit();
    return 0;
}


Section 28.14: Accessing memory beyond allocated chunk 


A pointer to a piece of memory containing n elements may only be dereferenced if it is in the range memory to
memory + (n - 1). Dereferencing a pointer outside of that range results in undefined behavior. As an example,
consider the following code:


int array[3]; 
int *beyond_array = array + 3; 
*beyond_array = 0; /* Accesses memory that has not been allocated. */


The third line accesses the 4th element in an array that is only 3 elements long, leading to undefined behavior. 
Similarly, the behavior of the second line in the following code fragment is also not well defined: 


int array[3]; 
array[3] = 0;


Note that pointing past the last element of an array is not undefined behavior (beyond_array = array + 3 is well 
defined here), but dereferencing it is (*beyond_array is undefined behavior). This rule also holds for dynamically 
allocated memory (such as buffers created through malloc). 


Section 28.15: Modifying a const variable using a pointer 


int main (void)
{
    const int foo_readonly = 10;
    int *foo_ptr;

    foo_ptr = (int *)&foo_readonly; /* (1) This casts away the const qualifier */
    *foo_ptr = 20; /* This is undefined behavior */

    return 0;
}


Quoting ISO/IEC 9899:201x, section 6.7.3 §2: 


If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue 
with non-const-qualified type, the behavior is undefined. [...] 


(1) Without the explicit cast, GCC issues the following warning: warning: assignment discards ‘const’ qualifier from pointer
target type [-Wdiscarded-qualifiers]


Section 28.16: Reading an uninitialized object that is not
backed by memory 


Version ≥ C11


Reading an object will cause undefined behavior if the object is (1):


• uninitialized
• defined with automatic storage duration
• its address is never taken


The variable a in the below example satisfies all those conditions: 


void Function( void )
{
    int a;
    int b = a;
}


(1) (Quoted from: ISO/IEC 9899:201x, 6.3.2.1 Lvalues, arrays, and function designators, paragraph 2)

If the lvalue designates an object of automatic storage duration that could have been declared with the register 
storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no 
assignment to it has been performed prior to use), the behavior is undefined. 


Section 28.17: Addition or subtraction of pointer not properly 
bounded 


The following code has undefined behavior: 


char buffer[6] = "hello";
char *ptr1 = buffer - 1; /* undefined behavior */
char *ptr2 = buffer + 5; /* OK, pointing to the '\0' inside the array */
char *ptr3 = buffer + 6; /* OK, pointing to just beyond */
char *ptr4 = buffer + 7; /* undefined behavior */


According to C11, if addition or subtraction of a pointer into, or just beyond, an array object and an integer type 
produces a result that does not point into, or just beyond, the same array object, the behavior is undefined (6.5.6). 


Additionally it is naturally undefined behavior to dereference a pointer that points to just beyond the array: 


char buffer[6] = "hello"; 
char *ptr3 = buffer + 6; /* OK, pointing to just beyond */ 
char value = *ptr3; /* undefined behavior */ 


Section 28.18: Dereferencing a null pointer 
This is an example of dereferencing a NULL pointer, causing undefined behavior. 


int * pointer = NULL; 
int value = *pointer; /* Dereferencing happens here */ 


A NULL pointer is guaranteed by the C standard to compare unequal to any pointer to a valid object, and
dereferencing it invokes undefined behavior.


Section 28.19: Using fflush on an input stream


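The C standard states that using fflush on an input stream is undefined behavior; it defines fflush only for
output streams (and for update streams whose most recent operation was not input). POSIX additionally defines the
behavior for seekable input files, as noted below.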


#include <stdio.h>

int main()
{
    int i;
    char input[4096];

    scanf("%i", &i);
    fflush(stdin); // <-- undefined behavior
    fgets(input, sizeof input, stdin);

    return 0;
}


There is no standard way to discard unread characters from an input stream. On the other hand, some
implementations use fflush to clear the stdin buffer. Microsoft defines the behavior of fflush on an input stream: If
the stream is open for input, fflush clears the contents of the buffer. According to POSIX.1-2008, the behavior of
fflush is undefined unless the input file is seekable.


See Using fflush(stdin) for many more details. 
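
Although no library call discards pending input, a program can read and ignore characters itself. The sketch below consumes the rest of the current line, which is often what fflush(stdin) was meant to achieve after a scanf call; the helper name discard_line is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: consumes the rest of the current line from stream.
 * Returns the last character read, which is '\n' or EOF. */
int discard_line(FILE *stream)
{
    int c;

    while ((c = getc(stream)) != '\n' && c != EOF)
        ;   /* skip the character */
    return c;
}
```

Calling discard_line(stdin) after scanf("%i", &i) removes whatever the user typed after the number, including the newline, so that the next line-oriented read starts on fresh input.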


Section 28.20: Inconsistent linkage of identifiers 


extern int var; 
static int var; /* Undefined behaviour */ 


C11, §6.2.2, 7 says: 


If, within a translation unit, the same identifier appears with both internal and external linkage, the 
behavior is undefined. 


Note that if a prior declaration of an identifier is visible, the later declaration takes on the prior declaration's
linkage. C11, §6.2.2, 4 allows it:


For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of 
that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the
identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior 
declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external 
linkage. 


/* 1. This is NOT undefined */ 
static int var; 
extern int var; 


/* 2. This is NOT undefined */ 
static int var; 
static int var; 


/* 3. This is NOT undefined */
extern int var; 
extern int var; 


Section 28.21: Missing return statement in value returning 
function 


int foo(void) {
    /* do stuff */
    /* no return here */
}


int main(void) {
    /* Trying to use the (not) returned value causes UB */
    int value = foo();
    return 0;
}


When a function is declared to return a value then it has to do so on every possible code path through it. Undefined
behavior occurs as soon as the caller (which is expecting a return value) tries to use the return value (1).


Note that the undefined behaviour happens only if the caller attempts to use/access the value from the function. 
For example, 


int foo(void) {
    /* do stuff */
    /* no return here */
}


int main(void) {
    /* The value (not) returned from foo() is unused. So, this program
     * doesn't cause undefined behaviour. */
    foo();
    return 0;
}


Version ≥ C99


The main() function is an exception to this rule in that it is possible for it to be terminated without a return
statement because an assumed return value of 0 will automatically be used in this case (2).


(1) (ISO/IEC 9899:201x, 6.9.1/12)


If the } that terminates a function is reached, and the value of the function call is used by the caller, the 
behavior is undefined. 


(2) (ISO/IEC 9899:201x, 5.1.2.2.3/1)


reaching the } that terminates the main function returns a value of 0. 


Section 28.22: Division by zero 


int x = 0;
int y = 5 / x; /* integer division */


or


double x = 0.0;
double y = 5.0 / x; /* floating point division */


or


int x = 0;
int y = 5 % x; /* modulo operation */


For the second line in each example, where the value of the second operand (x) is zero, the behaviour is undefined. 


Note that most implementations of floating point math will follow a standard (e.g. IEEE 754), in which case 
operations like divide-by-zero will have consistent results (e.g., INFINITY) even though the C standard says the 
operation is undefined. 
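
For integers, the usual remedy is to test the divisor before dividing. The following minimal sketch uses a hypothetical helper, checked_div; it also rejects INT_MIN / -1, a second case (not discussed above) in which the quotient cannot be represented in an int on two's complement machines.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a / b in *out and returns true only when
 * the integer division is well defined. */
bool checked_div(int a, int b, int *out)
{
    if (b == 0)
        return false;   /* division by zero */
    if (a == INT_MIN && b == -1)
        return false;   /* quotient would overflow int */

    *out = a / b;
    return true;
}
```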


Section 28.23: Conversion between pointer types produces 
incorrectly aligned result 


The following might have undefined behavior due to incorrect pointer alignment: 


char *memory_block = calloc(sizeof(uint32_t) + 1, 1); 
uint32_t *intptr = (uint32_t*)(memory_block + 1); /* possible undefined behavior */ 
uint32_t mvalue = *intptr; 


The undefined behavior happens as the pointer is converted. According to C11, if a conversion between two pointer
types produces a result that is incorrectly aligned (6.3.2.3), the behavior is undefined. Here a uint32_t could require
alignment of 2 or 4 for example.


calloc on the other hand is required to return a pointer that is suitably aligned for any object type; thus
memory_block is properly aligned to contain a uint32_t in its initial part. Then, on a system where uint32_t has
required alignment of 2 or 4, memory_block + 1 will be an odd address and thus not properly aligned.


Observe that the C standard makes the cast operation itself undefined, not just a later access through the
pointer. This is imposed because on platforms where addresses are segmented, the byte address memory_block + 1
may not even have a proper representation as an integer pointer.


Casting char * to pointers to other types without any concern to alignment requirements is sometimes incorrectly 
used for decoding packed structures such as file headers or network packets. 


You can avoid the undefined behavior arising from misaligned pointer conversion by using memcpy: 


memcpy(&mvalue, memory_block + 1, sizeof mvalue);


Here no pointer conversion to uint32_t* takes place and the bytes are copied one by one.


In our example, this copy operation leads to a valid value of mvalue only because:


• We used calloc, so the bytes are properly initialized. In our case all bytes have value 0, but any other proper
  initialization would do.
• uint32_t is an exact width type and has no padding bits.
• Any arbitrary bit pattern is a valid representation for any unsigned type.
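
The memcpy approach can be wrapped in a small function for decoding fields at arbitrary offsets in a byte buffer. This is a minimal sketch; the helper name load_u32 is hypothetical, and the value is interpreted in the host's byte order, so data from a file or network with a different byte order would need an additional conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a uint32_t stored at any byte address,
 * aligned or not. The bytes are interpreted in host byte order. */
uint32_t load_u32(const unsigned char *bytes)
{
    uint32_t value;

    memcpy(&value, bytes, sizeof value);   /* no conversion to uint32_t* takes place */
    return value;
}
```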


Section 28.24: Modifying the string returned by getenv,
strerror, and setlocale functions 


Modifying the strings returned by the standard functions getenv(), strerror() and setlocale() is undefined. So, 
implementations may use static storage for these strings. 


The getenv() function, C11, §7.22.4.7, 4, says:


The getenv function returns a pointer to a string associated with the matched list member. The string
pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the
getenv function.


The strerror() function, C11, §7.23.6.3, 4 says: 


The strerror function returns a pointer to the string, the contents of which are locale-specific. The array
pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the 
strerror function. 


The setlocale() function, C11, §7.11.1.1, 8 says: 


The pointer to string returned by the setlocale function is such that a subsequent call with that string
value and its associated category will restore that part of the program’s locale. The string pointed to shall
not be modified by the program, but may be overwritten by a subsequent call to the setlocale function.


Similarly, the localeconv() function returns a pointer to struct lconv which shall not be modified.


The localeconv() function, C11, §7.11.2.1, 8 says:


The localeconv function returns a pointer to the filled-in object. The structure pointed to by the return
value shall not be modified by the program, but may be overwritten by a subsequent call to the
localeconv function.
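
If a program needs to change such a string, or keep it across later calls, it can work on its own copy instead. The following is a minimal sketch using only standard functions; the helper name copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of s that the caller owns
 * and may modify, or NULL if s is NULL or the allocation fails. */
char *copy_string(const char *s)
{
    char *copy;

    if (s == NULL)
        return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

For example, char *path = copy_string(getenv("PATH")); yields a buffer that can be edited freely and must eventually be released with free(path). The NULL check matters because getenv returns a null pointer when the variable is not set.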


Chapter 29: Random Number Generation


Section 29.1: Basic Random Number Generation 


The function rand() can be used to generate a pseudo-random integer value between 0 and RAND_MAX (0 and
RAND_MAX included).


srand(unsigned int) is used to seed the pseudo-random number generator. Each time rand() is seeded with the same
seed, it must produce the same sequence of values. It should only be seeded once before calling rand(). It should
not be repeatedly seeded, or reseeded every time you wish to generate a new batch of pseudo-random numbers.


Standard practice is to use the result of time(NULL) as a seed. If you need a deterministic sequence, you can seed
the generator with the same value on each program start. This is generally not required for release code, but is
useful in debug runs to make bugs reproducible.


It is advisable to always seed the generator; if it is not seeded, it behaves as if it had been seeded with srand(1).


#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    int i;

    srand((unsigned int)time(NULL));
    i = rand();

    printf("Random value between [0, %d]: %d\n", RAND_MAX, i);
    return 0;
}


Possible output: 
Random value between [0, 2147483647]: 823321433 


Notes: 


The C Standard does not guarantee the quality of the random sequence produced. In the past, some 
implementations of rand() had serious issues in distribution and randomness of the generated numbers. The 
usage of rand() is not recommended for serious random number generation needs, like cryptography. 


Section 29.2: Permuted Congruential Generator 


Here's a standalone random number generator that doesn't rely on rand() or similar library functions. 


Why would you want such a thing? Maybe you don't trust your platform's builtin random number generator, or 
maybe you want a reproducible source of randomness independent of any particular library implementation. 


This code is PCG32 from pcg-random.org, a modern, fast, general-purpose RNG with excellent statistical properties. 
It's not cryptographically secure, so don't use it for cryptography. 


#include <stdint.h> 


/* *Really* minimal PCG32 code / (c) 2014 M.E. O'Neill / pcg-random.org
 * Licensed under Apache License 2.0 (NO WARRANTY, etc. see website) */

typedef struct { uint64_t state; uint64_t inc; } pcg32_random_t;

uint32_t pcg32_random_r(pcg32_random_t* rng) {
    uint64_t oldstate = rng->state;
    /* Advance internal state */
    rng->state = oldstate * 6364136223846793005ULL + (rng->inc | 1);
    /* Calculate output function (XSH RR), uses old state for max ILP */
    uint32_t xorshifted = ((oldstate >> 18u) ^ oldstate) >> 27u;
    uint32_t rot = oldstate >> 59u;
    return (xorshifted >> rot) | (xorshifted << ((-rot) & 31));
}

void pcg32_srandom_r(pcg32_random_t* rng, uint64_t initstate, uint64_t initseq) {
    rng->state = 0U;
    rng->inc = (initseq << 1u) | 1u;
    pcg32_random_r(rng);
    rng->state += initstate;
    pcg32_random_r(rng);
}


And here's how to call it: 


#include <stdio.h>

int main(void) {
    pcg32_random_t rng; /* RNG state */
    int i;

    /* Seed the RNG */
    pcg32_srandom_r(&rng, 42u, 54u);

    /* Print some random 32-bit integers */
    for (i = 0; i < 6; i++)
        printf("0x%08x\n", pcg32_random_r(&rng));

    return 0;
}


Section 29.3: Xorshift Generation 


A good and easy alternative to the flawed rand() procedures is xorshift, a class of pseudo-random number
generators discovered by George Marsaglia. The xorshift generator is among the fastest non-cryptographically-
secure random number generators. More information and other example implementations are available on the
xorshift Wikipedia page.


Example implementation 
#include <stdint.h>

/* These state variables must be initialised so that they are not all zero. */
uint32_t w, x, y, z;

uint32_t xorshift128(void)
{
    uint32_t t = x;
    t ^= t << 11U;
    t ^= t >> 8U;
    x = y; y = z; z = w;
    w ^= w >> 19U;
    w ^= t;
    return w;
}


Section 29.4: Restrict generation to a given range 


Usually when generating random numbers it is useful to generate integers within a range, or a value p between 0.0
and 1.0. Whilst the modulus operation can be used to reduce the result of rand() to a small integer, this uses the
low bits, which often go through a short cycle, and it slightly skews the distribution if N is large in proportion to
RAND_MAX.


The macro 

#define uniform() (rand() / (RAND_MAX + 1.0))

produces a p value on 0.0 to 1.0 - epsilon, so 

i = (int)(uniform() * N) 

will set i to a uniform random number within the range 0 to N - 1. 


Unfortunately there is a technical flaw, in that RAND_MAX is permitted to be larger than a variable of type double
can accurately represent. This means that RAND_MAX + 1.0 evaluates to RAND_MAX and the function occasionally
returns unity. This is unlikely, however.
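
The scaling technique and its edge case can be combined in one place. The sketch below writes uniform as a function rather than a macro and adds a hypothetical helper, random_below, that clamps the result so the rare 1.0 return value cannot produce an out-of-range index; both names and the clamping strategy are illustrative assumptions.

```c
#include <stdlib.h>

/* Same scaling as the uniform() macro, written as a function. */
double uniform(void)
{
    return rand() / (RAND_MAX + 1.0);
}

/* Hypothetical helper: returns an integer in [0, n - 1] for n > 0.
 * The final check covers the rare case where uniform() returns 1.0. */
int random_below(int n)
{
    int i = (int)(uniform() * n);

    return (i < n) ? i : n - 1;
}
```

A simulated die roll is then random_below(6) + 1, after the generator has been seeded once with srand.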


Chapter 30: Preprocessor and Macros


All preprocessor commands begin with the hash (pound) symbol #. A C macro is just a preprocessor command
that is defined using the #define preprocessor directive. During the preprocessing stage, the C preprocessor (a part
of the C compiler) simply substitutes the body of the macro wherever its name appears.


Section 30.1: Header Include Guards 


Pretty much every header file should follow the include guard idiom: 


my-header-file.h 


#ifndef MY_HEADER_FILE_H 
#define MY_HEADER_FILE_H 


// Code body for header file 


#endif 


This ensures that when you #include "my-header-file.h" in multiple places, you don't get duplicate declarations
of functions, variables, etc. Imagine the following hierarchy of files:


header-1.h 

typedef struct {
    /* ... */
} MyStruct;

int myFunction(MyStruct *value);


header-2.h 


#include "header-1.h" 
int myFunction2(MyStruct *value);


main.c 


#include "header-1.h" 
#include "header-2.h" 


int main() { 


// do something 


} 


This code has a serious problem: the detailed contents of MyStruct are defined twice, which is not allowed. This
would result in a compilation error that can be difficult to track down, since one header file includes another. If you
instead did it with header guards:


header-1.h 


#ifndef HEADER_1_H 
#define HEADER_1_H 


typedef struct {
    /* ... */
} MyStruct;

int myFunction(MyStruct *value);


#endif 


header-2.h 


#ifndef HEADER_2_H 
#define HEADER_2_H 


#include "header-1.h" 
int myFunction2(MyStruct *value);


#endif 


main.c 


#include "header-1.h" 
#include "header-2.h" 


int main() { 


// do something 
} 


This would then expand to: 


#ifndef HEADER_1_H
#define HEADER_1_H

typedef struct {
    /* ... */
} MyStruct;

int myFunction(MyStruct *value);
#endif

#ifndef HEADER_2_H
#define HEADER_2_H

#ifndef HEADER_1_H // Safe, since HEADER_1_H was #define'd before.
#define HEADER_1_H

typedef struct {
    /* ... */
} MyStruct;

int myFunction(MyStruct *value);
#endif

int myFunction2(MyStruct *value);
#endif

int main() {
    // do something
}


When the compiler reaches the second inclusion of header-1.h, HEADER_1_H was already defined by the previous
inclusion. Ergo, it boils down to the following: 


#define HEADER_1_H

typedef struct {
    /* ... */
} MyStruct;

int myFunction(MyStruct *value);

#define HEADER_2_H

int myFunction2(MyStruct *value);

int main() {
    // do something
}


And thus there is no compilation error. 


Note: There are multiple different conventions for naming the header guards. Some people like to name it 
HEADER_2_H_, some include the project name like MY_PROJECT_HEADER_2_H. The important thing is to ensure that the 
convention you follow makes it so that each file in your project has a unique header guard. 


If the structure details were not included in the header, the type declared would be incomplete or an opaque type. 
Such types can be useful, hiding implementation details from users of the functions. For many purposes, the FILE 
type in the standard C library can be regarded as an opaque type (though it usually isn't opaque so that macro 
implementations of the standard I/O functions can make use of the internals of the structure). In that case, the 
header-1.h could contain: 


#ifndef HEADER_1_H 
#define HEADER_1_H 


typedef struct MyStruct MyStruct; 
int myFunction(MyStruct *value);


#endif 


Note that the structure must have a tag name (here MyStruct — that's in the tags namespace, separate from the 
ordinary identifiers namespace of the typedef name MyStruct), and that the { ... } is omitted. This says "there is a 
structure type struct MyStruct and there is an alias for it MyStruct". 


In the implementation file, the details of the structure can be defined to make the type complete:


struct MyStruct {
    /* ... */
};


If you are using C11, you could repeat the typedef struct MyStruct MyStruct; declaration without causing a 
compilation error, but earlier versions of C would complain. Consequently, it is still best to use the include guard 
idiom, even though in this example, it would be optional if the code was only ever compiled with compilers that
supported C11. 


Many compilers support the #pragma once directive, which has the same results: 
my-header-file.h 


#pragma once 


// Code for header file 


However, #pragma once is not part of the C standard, so the code is less portable if you use it. 


A few headers do not use the include guard idiom. One specific example is the standard <assert.h> header. It may 
be included multiple times in a single translation unit, and the effect of doing so depends on whether the macro 
NDEBUG is defined each time the header is included. You may occasionally have an analogous requirement; such 
cases will be few and far between. Ordinarily, your headers should be protected by the include guard idiom. 


Section 30.2: #if 0 to block out code sections


If there are sections of code that you are considering removing or want to temporarily disable, you can comment them
out with a block comment.


/* Block comment around whole function to keep it from getting used.
 * What's even the purpose of this function?
int myUnusedFunction(void)
{
    int i = 5;
    return i;
}
*/


However, if the source code you have surrounded with a block comment has block style comments in the source, 
the ending */ of the existing block comments can cause your new block comment to be invalid and cause 
compilation problems. 


/* Block comment around whole function to keep it from getting used.
 * What's even the purpose of this function?
int myUnusedFunction(void)
{
    int i = 5;
    /* Return 5 */
    return i;
}
*/


In the previous example, the last two lines of the function and the last '*/' are seen by the compiler, so it would
compile with errors. A safer method is to use an #if 0 directive around the code you want to block out.


#if 0
/* #if 0 evaluates to false, so everything between here and the #endif is
 * removed by the preprocessor. */
int myUnusedFunction(void)
{
    int i = 5;
    return i;
}
#endif


A benefit with this is that when you want to go back and find the code, it's much easier to do a search for "#if 0" 
than searching all your comments. 


Another very important benefit is that you can nest commenting out code with #if 0. This cannot be done with
comments. 
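
Nesting works because the preprocessor keeps matching #if and #endif pairs even inside a skipped region. The following is a minimal sketch with a hypothetical function, active_value, showing that an inner #endif closes only the inner #if 0.

```c
int active_value(void)
{
    int value = 1;

#if 0
    /* Outer disabled region. */
    value = 2;
#if 0
    /* Inner disabled region: its #endif closes only this inner #if. */
    value = 3;
#endif
    value = 4;   /* still disabled, because the outer #if 0 is open */
#endif

    return value;
}
```

None of the assignments between the outer #if 0 and its #endif are compiled, so the function returns the initial value.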


An alternative to using #if 0 is to use a name that will not be #defined but is more descriptive of why the code is
being blocked out. For instance, if there is a function that seems to be useless dead code you might use #if
defined(POSSIBLE_DEAD_CODE), or #if defined(FUTURE_CODE_REL_020201) for code needed once other
functionality is in place, or something similar. Then when going back through to remove or enable that source, those
sections of source are easy to find.


Section 30.3: Function-like macros 
Function-like macros are similar to inline functions and are useful in some cases, such as a temporary debug log:


#ifdef DEBUG
# define LOGFILENAME "/tmp/logfile.log"

# define LOG(str) do {                                  \
    FILE *fp = fopen(LOGFILENAME, "a");                 \
    if (fp) {                                           \
        fprintf(fp, "%s:%d %s\n", __FILE__, __LINE__,   \
                /* don't print null pointer */          \
                (str) ? (str) : "<null>");              \
        fclose(fp);                                     \
    }                                                   \
    else {                                              \
        perror("Opening '" LOGFILENAME "' failed");     \
    }                                                   \
} while (0)
#else
/* Make it a NOOP if DEBUG is not defined. */
# define LOG(LINE) (void)0
#endif


#include <stdio.h>

int main(int argc, char* argv[])
{
    if (argc > 1)
        LOG("There are command line arguments");
    else
        LOG("No command line arguments");
    return 0;
}


Here in both cases (with DEBUG or not) the call behaves the same way as a function with void return type. This
ensures that the if/else conditionals are interpreted as expected.


In the DEBUG case this is implemented through a do { ... } while (0) construct. In the other case, (void)0 is a
statement with no side effect that is just ignored.


An alternative for the latter would be

#define LOG(LINE) do { /* empty */ } while (0)

such that it is in all cases syntactically equivalent to the first.


If you use GCC, you can also implement a function-like macro that returns a result using a non-standard GNU
extension: statement expressions. For example:


#include <stdio.h>

#define POW(X, Y) \
({ \
    int i, r = 1; \
    for (i = 0; i < Y; ++i) \
        r *= X; \
    r; /* the value of the last expression is the result */ \
})

int main(void)
{
    int result;

    result = POW(2, 3);
    printf("Result: %d\n", result);
}


Section 30.4: Source file inclusion 


The most common uses of #include preprocessing directives are as in the following: 


#include <stdio.h> 
#include "myheader.h" 


#include replaces the statement with the contents of the file referred to. Angle brackets (<>) refer to header files 
installed on the system, while quotation marks ("") are for user-supplied files. 


The operand of #include can also be a macro, which the preprocessor expands before processing the directive, as
this example illustrates:


#if VERSION == 1
    #define INCFILE "vers1.h"
#elif VERSION == 2
    #define INCFILE "vers2.h"
    /* and so on */
#else
    #define INCFILE "versN.h"
#endif
/* ... */
#include INCFILE


Section 30.5: Conditional inclusion and conditional function 
signature modification 


To conditionally include a block of code, the preprocessor has several directives (e.g. #if, #ifdef, #else, #endif,
etc.).


/* Defines a conditional `printf` macro, which only prints if `DEBUG`
 * has been defined
 */
#ifdef DEBUG
#define DLOG(x) (printf(x))
#else
#define DLOG(x)
#endif


Normal C relational operators may be used for the #if condition:


#if __STDC_VERSION__ >= 201112L
/* Do stuff for C11 or higher */
#elif __STDC_VERSION__ >= 199901L
/* Do stuff for C99 */
#else
/* Do stuff for pre C99 */
#endif


The #if directive behaves similarly to the C if statement, but its condition may only contain integer constant
expressions and no casts. It supports one additional unary operator, defined(identifier), which returns 1 if the
identifier is defined, and 0 otherwise.


#if defined(DEBUG) && !defined(QUIET)
#define DLOG(x) (printf(x))
#else
#define DLOG(x)
#endif


Conditional Function Signature Modification 


In most cases a release build of an application is expected to have as little overhead as possible. However during 
testing of an interim build, additional logs and information about problems found can be helpful. 


For example, assume there is some function SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd) which
should generate a log about its use in a test build. However, this function is used in multiple places, and part of
the information wanted in the log is where the function is being called from.


So using conditional compilation you can have something like the following in the include file declaring the function.
This replaces the standard version of the function with a debug version of the function. The preprocessor is used to
replace calls to the function SerOpPluAllRead() with calls to the function SerOpPluAllRead_Debug() with two
additional arguments, the name of the file and the line number of where the function is used.


Conditional compilation is used to choose whether to override the standard function with a debug version or not. 


#if 0
// function declaration and prototype for our debug version of the function.
SHORT SerOpPluAllRead_Debug(PLUIF *pPif, USHORT usLockHnd, char *aszFilePath, int nLineNo);

// macro definition to replace function call using old name with debug function with additional
// arguments.
#define SerOpPluAllRead(pPif,usLock) SerOpPluAllRead_Debug(pPif,usLock,__FILE__,__LINE__)
#else
// standard function declaration that is normally used with builds.
SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd);
#endif


This allows you to override the standard version of the function SerOpPluAllRead() with a version that will provide
the name of the file and line number in the file of where the function is called.


There is one important consideration: any file using this function must include the header file where this approach is 
used in order for the preprocessor to modify the function. Otherwise you will see a linker error. 


The definition of the function would look something like the following. What this source does is to request that the
preprocessor rename the function SerOpPluAllRead() to be SerOpPluAllRead_Debug() and to modify the
argument list to include two additional arguments, a pointer to the name of the file where the function was called
and the line number in the file at which the function is used.


#if defined(SerOpPluAllRead)
// forward declare the replacement function which we will call once we create our log.
SHORT SerOpPluAllRead_Special(PLUIF *pPif, USHORT usLockHnd);

SHORT SerOpPluAllRead_Debug(PLUIF *pPif, USHORT usLockHnd, char *aszFilePath, int nLineNo)
{
    int iLen = 0;
    char xBuffer[256];

    // only print the last 30 characters of the file name to shorten the logs.
    iLen = strlen(aszFilePath);
    if (iLen > 30) {
        iLen = iLen - 30;
    }
    else {
        iLen = 0;
    }

    sprintf(xBuffer, "SerOpPluAllRead_Debug(): husHandle = %d, File %s, lineno = %d",
            pPif->husHandle, aszFilePath + iLen, nLineNo);
    IssueDebugLog(xBuffer);

    // now that we have issued the log, continue with standard processing.
    return SerOpPluAllRead_Special(pPif, usLockHnd);
}

// our special replacement function name for when we are generating logs.
SHORT SerOpPluAllRead_Special(PLUIF *pPif, USHORT usLockHnd)
#else
// standard, normal function name (signature) that is replaced with our debug version.
SHORT SerOpPluAllRead(PLUIF *pPif, USHORT usLockHnd)
#endif
{
    if (STUB_SELF == SstReadAsMaster()) {
        return OpPluAllRead(pPif, usLockHnd);
    }
    return OP_NOT_MASTER;
}


Section 30.6: __cplusplus for using C externals in C++ code
compiled with C++ - name mangling


There are times when an include file has to generate different output from the preprocessor depending on whether 
the compiler is a C compiler or a C++ compiler due to language differences. 


For example, a function or other external is defined in a C source file but is used in a C++ source file. Since C++ uses
name mangling (or name decoration) in order to generate unique function names based on function argument
types, a C function declaration used in a C++ source file will cause link errors. The C++ compiler will modify the
specified external name for the compiler output using the name mangling rules for C++. The result is link errors due
to externals not found when the C++ compiler output is linked with the C compiler output.


Since C compilers do not do name mangling but C++ compilers do for all external labels (function names or variable
names) generated by the C++ compiler, a predefined preprocessor macro, __cplusplus, was introduced to allow for
compiler detection.


In order to work around this problem of incompatible compiler output for external names between C and C++, the 
macro __cplusplus is defined in the C++ Preprocessor and is not defined in the C Preprocessor. This macro name 
can be used with the conditional preprocessor #ifdef directive or #if with the defined() operator to tell whether 
a source code or include file is being compiled as C++ or C. 


#ifdef __cplusplus
printf("C++\n");
#else
printf("C\n");
#endif


Or you could use


#if defined(__cplusplus)
printf("C++\n");
#else
printf("C\n");
#endif


In order to specify the correct function name of a function from a C source file compiled with the C compiler that is
being used in a C++ source file you could check for the __cplusplus defined constant in order to cause the extern
"C" { /* ... */ }; to be used to declare C externals when the header file is included in a C++ source file.
However, when compiled with a C compiler, the extern "C" { /* ... */ }; is not used. This conditional
compilation is needed because extern "C" { /* ... */ }; is valid in C++ but not in C.


#ifdef __cplusplus
// if we are being compiled with a C++ compiler then declare the
// following functions as C functions to prevent name mangling.
extern "C" {
#endif

// exported C function list.
int foo(void);

#ifdef __cplusplus
// if this is a C++ compiler, we need to close off the extern declaration.
}
#endif


Section 30.7: Token pasting 


Token pasting allows one to glue together two macro arguments. For example, front##back yields frontback. A
famous example is Win32's <TCHAR.H> header. In standard C, one can write L"string" to declare a wide character
string. However, the Windows API allows one to convert between wide character strings and narrow character strings
simply by #define-ing UNICODE. In order to implement the string literals, TCHAR.H uses this:


#ifdef UNICODE
#define TEXT(x) L##x
#endif


Whenever a user writes TEXT("hello, world"), and UNICODE is defined, the C preprocessor concatenates L and
the macro argument. L concatenated with "hello, world" gives L"hello, world".


Section 30.8: Predefined Macros 


A predefined macro is a macro that is already understood by the C preprocessor without the program needing to
define it. Examples include


Mandatory Pre-Defined Macros


• __FILE__, which gives the file name of the current source file (a string literal),
• __LINE__ for the current line number (an integer constant),
• __DATE__ for the compilation date (a string literal),
• __TIME__ for the compilation time (a string literal).


There's also a related predefined identifier, __func__ (ISO/IEC 9899:2011 §6.4.2.2), which is not a macro: 


The identifier __func__ shall be implicitly declared by the translator as if, immediately following the 
opening brace of each function definition, the declaration: 


static const char __func__[] = "function-name" ; 


appeared, where function-name is the name of the lexically-enclosing function. 


__FILE__, __LINE__ and __func__ are especially useful for debugging purposes. For example: 


fprintf(stderr, "%s: %s: %d: Denominator is 0", __FILE__, __func__, __LINE__);


Pre-C99 compilers may or may not support __func__, or may provide a differently named macro that acts the same.
For example, gcc used __FUNCTION__ in C89 mode.


The macros below describe the implementation:


• __STDC_VERSION__ The version of the C Standard implemented. This is a constant integer using the format
yyyymmL (the value 201112L for C11, the value 199901L for C99; it wasn't defined for C89/C90).
• __STDC_HOSTED__ 1 if it's a hosted implementation, else 0.
• __STDC__ If 1, the implementation conforms to the C Standard.


Other Pre-Defined Macros (non mandatory) 


ISO/IEC 9899:2011 §6.10.8.2 Environment macros:


• __STDC_ISO_10646__ An integer constant of the form yyyymmL (for example, 199712L). If this
symbol is defined, then every character in the Unicode required set, when stored in an object of
type wchar_t, has the same value as the short identifier of that character. The Unicode required set
consists of all the characters that are defined by ISO/IEC 10646, along with all amendments and
technical corrigenda, as of the specified year and month. If some other encoding is used, the macro
shall not be defined and the actual encoding used is implementation-defined.

• __STDC_MB_MIGHT_NEQ_WC__ The integer constant 1, intended to indicate that, in the encoding for
wchar_t, a member of the basic character set need not have a code value equal to its value when
used as the lone character in an integer character constant.

• __STDC_UTF_16__ The integer constant 1, intended to indicate that values of type char16_t are
UTF-16 encoded. If some other encoding is used, the macro shall not be defined and the actual
encoding used is implementation-defined.

• __STDC_UTF_32__ The integer constant 1, intended to indicate that values of type char32_t are
UTF-32 encoded. If some other encoding is used, the macro shall not be defined and the actual
encoding used is implementation-defined.


ISO/IEC 9899:2011 §6.10.8.3 Conditional feature macros


• __STDC_ANALYZABLE__ The integer constant 1, intended to indicate conformance to the
specifications in annex L (Analyzability).

• __STDC_IEC_559__ The integer constant 1, intended to indicate conformance to the specifications
in annex F (IEC 60559 floating-point arithmetic).

• __STDC_IEC_559_COMPLEX__ The integer constant 1, intended to indicate adherence to the
specifications in annex G (IEC 60559 compatible complex arithmetic).

• __STDC_LIB_EXT1__ The integer constant 201112L, intended to indicate support for the extensions
defined in annex K (Bounds-checking interfaces).

• __STDC_NO_ATOMICS__ The integer constant 1, intended to indicate that the implementation does
not support atomic types (including the _Atomic type qualifier) and the <stdatomic.h> header.

• __STDC_NO_COMPLEX__ The integer constant 1, intended to indicate that the implementation does
not support complex types or the <complex.h> header.

• __STDC_NO_THREADS__ The integer constant 1, intended to indicate that the implementation does
not support the <threads.h> header.

• __STDC_NO_VLA__ The integer constant 1, intended to indicate that the implementation does not
support variable length arrays or variably modified types.


Section 30.9: Variadic arguments macro 


Version ≥ C99


Macros with variadic args:


Let's say you want to create some print-macro for debugging your code, let's take this macro as an example:


#define debug_print(msg) printf("%s:%d %s", __FILE__, __LINE__, msg)


Some examples of usage:


The function somefunc() returns -1 if it failed and 0 if it succeeded, and it is called from many different places
within the code:


int retVal = somefunc();

if(retVal == -1)
{
    debug_print("somefunc() has failed");
}

/* some other code */

retVal = somefunc();

if(retVal == -1)
{
    debug_print("somefunc() has failed");
}


What happens if the implementation of somefunc() changes, and it now returns different values matching different
possible error types? You still want to use the debug macro and print the error value.


debug_print(retVal); /* this would obviously fail */
debug_print("%d",retVal); /* this would also fail */


To solve this problem the __VA_ARGS__ macro was introduced. It lets a macro accept a variable number of
arguments:


Example:

#define debug_print(msg, ...) do { \
        printf(msg, __VA_ARGS__); \
        printf("\nError occurred in file:line (%s:%d)\n", __FILE__, __LINE__); \
    } while (0)

Usage:


int retVal = somefunc(); 


debug_print("retVal of somefunc() is-> %d", retVal); 


This macro allows you to pass multiple parameters and print them, but now it forbids you from calling it with the
message alone.


debug_print("Hey");


This would raise a syntax error, as the macro expects at least one more argument and the preprocessor does
not remove the comma that precedes __VA_ARGS__ in the expansion. Also debug_print("Hey", ); would raise a
syntax error, as the expansion then passes an empty argument to printf().


To solve this, GCC and Clang support the ##__VA_ARGS__ extension (it is not part of standard C): if no variable
arguments are given, the preprocessor deletes the preceding comma.


Example:

#define debug_print(msg, ...) do { \
        printf(msg, ##__VA_ARGS__); \
        printf("\nError occurred in file:line (%s:%d)\n", __FILE__, __LINE__); \
    } while (0)

Usage:


debug_print("Ret val of somefunc()?"); 
debug_print("%d", somefunc()) ; 


Section 30.10: Macro Replacement 
The simplest form of macro replacement is to define a manifest constant, as in 


#define ARRSIZE 100
int array[ARRSIZE];


This defines a function-like macro that multiplies a variable by 10 and stores the new value:


#define TIMES10(A) ((A) *= 10)

double b = 34;
int c = 2;

TIMES10(b);  // good: ((b) *= 10);
TIMES10(c);  // good: ((c) *= 10);
TIMES10(5);  // bad: ((5) *= 10);


The replacement is done before any other interpretation of the program text. In the first call to TIMES10 the name A
from the definition is replaced by b and the so expanded text is then put in place of the call. Note that this
definition of TIMES10 is not equivalent to


#define TIMES10(A) ((A) = (A) * 10)


because this could evaluate the replacement of A twice, which can have unwanted side effects.


The following defines a function-like macro whose value is the maximum of its arguments. It has the advantages of
working for any compatible types of the arguments and of generating in-line code without the overhead of function
calling. It has the disadvantages of evaluating one or the other of its arguments a second time (including side
effects) and of generating more code than a function if invoked several times.


#define max(a, b) ((a) > (b) ? (a) : (b))

int maxVal = max(11, 43);              /* 43 */
int maxValExpr = max(11 + 36, 51 - 7); /* 47 */

/* Should not be done, due to expression being evaluated twice */
int j = 0, i = 0;
int sideEffect = max(++i, ++j); /* j is incremented twice: i == 1, j == 2, sideEffect == 2 */


Because of this, such macros that evaluate their arguments multiple times are usually avoided in production code.
Since C11 the _Generic feature offers an alternative: it can select a type-specific function, which evaluates each
argument only once.


The abundant parentheses in the macro expansions (right hand side of the definition) ensure that the arguments 
and the resulting expression are bound properly and fit well into the context in which the macro is called. 


Section 30.11: Error directive 


If the preprocessor encounters an #error directive, compilation is halted and the diagnostic message included is 
printed. 


#define DEBUG

#ifdef DEBUG
#error "Debug Builds Not Supported"
#endif

int main(void) {
    return 0;
}


Possible output:


$ gcc error.c
error.c: error: #error "Debug Builds Not Supported"


Section 30.12: FOREACH implementation 


We can also use macros for making code easier to read and write. For example we can implement macros for 
implementing the foreach construct in C for some data structures like singly- and doubly-linked lists, queues, etc. 


Here is a small example.


#include <stdio.h>
#include <stdlib.h>

struct LinkedListNode
{
    int data;
    struct LinkedListNode *next;
};

#define FOREACH_LIST(node, list) \
    for (node=list; node; node=node->next)

/* Usage */
int main(void)
{
    struct LinkedListNode *list, **plist = &list, *node;
    int i;

    for (i=0; i<10; i++)
    {
        *plist = malloc(sizeof(struct LinkedListNode));
        (*plist)->data = i;
        (*plist)->next = NULL;
        plist = &(*plist)->next;
    }

    /* printing the elements here */
    FOREACH_LIST(node, list)
    {
        printf("%d\n", node->data);
    }
}


You can make a standard interface for such data-structures and write a generic implementation of FOREACH as: 


#include <stdio.h>
#include <stdlib.h>

typedef struct CollectionItem_
{
    int data;
    struct CollectionItem_ *next;
} CollectionItem;

typedef struct Collection_
{
    /* interface functions */
    void* (*first)(void *coll);
    void* (*last) (void *coll);
    void* (*next) (void *coll, CollectionItem *currItem);

    CollectionItem *collectionHead;
    /* Other fields */
} Collection;

/* must implement */
void *first(void *coll)
{
    return ((Collection*)coll)->collectionHead;
}

/* must implement */
void *last(void *coll)
{
    return NULL;
}

/* must implement */
void *next(void *coll, CollectionItem *curr)
{
    return curr->next;
}

CollectionItem *new_CollectionItem(int data)
{
    CollectionItem *item = malloc(sizeof(CollectionItem));
    item->data = data;
    item->next = NULL;
    return item;
}

void Add_Collection(Collection *coll, int data)
{
    CollectionItem **item = &coll->collectionHead;
    while(*item)
        item = &(*item)->next;
    (*item) = new_CollectionItem(data);
}

Collection *new_Collection()
{
    Collection *nc = malloc(sizeof(Collection));
    nc->first = first;
    nc->last = last;
    nc->next = next;
    nc->collectionHead = NULL; /* start with an empty collection */
    return nc;
}

/* generic implementation */
#define FOREACH(node, collection) \
    for (node = (collection)->first(collection); \
         node != (collection)->last(collection); \
         node = (collection)->next(collection, node))

int main(void)
{
    Collection *coll = new_Collection();
    CollectionItem *node;
    int i;

    for(i=0; i<10; i++)
    {
        Add_Collection(coll, i);
    }

    /* printing the elements here */
    FOREACH(node, coll)
    {
        printf("%d\n", node->data);
    }
}


To use this generic implementation just implement these functions for your data structure.


1. void* (*first)(void *coll);
2. void* (*last) (void *coll);
3. void* (*next) (void *coll, CollectionItem *currItem);


Chapter 31: Signal handling


Parameter  Details
sig        The signal to set the signal handler to, one of SIGABRT, SIGFPE, SIGILL, SIGTERM, SIGINT, SIGSEGV or
           some implementation-defined value
func       The signal handler, which is one of the following: SIG_DFL for the default handler, SIG_IGN to ignore
           the signal, or a pointer to a function with the signature void foo(int sig);


Section 31.1: Signal Handling with “signal” 


Signals can be synchronous (like SIGSEGV, segmentation fault) when they are triggered by a
malfunctioning of the program itself, or asynchronous (like SIGINT, interactive attention) when they are initiated
from outside the program, e.g. by a keypress such as Ctrl-C.


The signal() function is part of the ISO C standard and can be used to assign a function to handle a specific signal:


#include <stdio.h>  /* printf() */
#include <stdlib.h> /* abort() */
#include <signal.h> /* signal() */

void handler_nonportable(int sig)
{
    /* undefined behavior, maybe fine on specific platform */
    printf("Catched: %d\n", sig);

    /* abort is safe to call */
    abort();
}

sig_atomic_t volatile finished = 0;

void handler(int sig)
{
    switch (sig) {
    /* hardware interrupts should not return */
    case SIGSEGV:
    case SIGFPE:
    case SIGILL:
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
        /* Version >= C11: quick_exit is safe to call */
        quick_exit(EXIT_FAILURE);
#else
        /* Version < C11: use _Exit in pre-C11 */
        _Exit(EXIT_FAILURE);
#endif
    default:
        /* Reset the signal to the default handler,
           so we will not be called again if things go
           wrong on return. */
        signal(sig, SIG_DFL);
        /* let everybody know that we are finished */
        finished = sig;
        return;
    }
}

int main(void)
{
    /* Catch the SIGSEGV signal, raised on segmentation faults (e.g. NULL pointer access) */
    if (signal(SIGSEGV, &handler) == SIG_ERR) {
        perror("could not establish handler for SIGSEGV");
        return EXIT_FAILURE;
    }

    /* Catch the SIGTERM signal, termination request */
    if (signal(SIGTERM, &handler) == SIG_ERR) {
        perror("could not establish handler for SIGTERM");
        return EXIT_FAILURE;
    }

    /* Ignore the SIGINT signal, by setting the handler to `SIG_IGN`. */
    signal(SIGINT, SIG_IGN);

    /* Do something that takes some time here, and leaves
       the time to terminate the program from the keyboard. */

    /* Then: */

    if (finished) {
        fprintf(stderr, "we have been terminated by signal %d\n", (int)finished);
        return EXIT_FAILURE;
    }

    /* Try to force a segmentation fault, and raise a SIGSEGV */
    {
        char* ptr = 0;
        *ptr = 0;
    }

    /* This should never be executed */
    return EXIT_SUCCESS;
}


Using signal() imposes important limitations on what you are allowed to do inside the signal handlers; see the
remarks for further information.


POSIX recommends the usage of sigaction() instead of signal(), due to its underspecified behavior and
significant implementation variations. POSIX also defines many more signals than the ISO C standard, including
SIGUSR1 and SIGUSR2, which can be used freely by the programmer for any purpose.


Chapter 32: Variable arguments


Parameter    Details
va_list ap   argument pointer, current position in the list of variadic arguments
last         name of last non-variadic function argument, so the compiler finds the correct place to start
             processing variadic arguments; may not be declared as a register variable, a function, or an array
             type
type         promoted type of the variadic argument to read (e.g. int for a short int argument)
va_list src  current argument pointer to copy
va_list dst  new argument list to be filled in


Variable arguments are used by functions in the printf family (printf, fprintf, etc) and others to allow a function 
to be called with a different number of arguments each time, hence the name varargs. 


To implement functions using the variable arguments feature, use #include <stdarg.h>. 


To call functions which take a variable number of arguments, ensure there is a full prototype with the trailing 
ellipsis in scope: void err_exit(const char *format, ...); for example. 


Section 32.1: Using an explicit count argument to determine 
the length of the va_list 


With any variadic function, the function must know how to interpret the variable arguments list. With the printf()
or scanf() functions, the format string tells the function what to expect.


The simplest technique is to pass an explicit count of the other arguments (which are normally all the same type). 
This is demonstrated in the variadic function in the code below which calculates the sum of a series of integers, 
where there may be any number of integers but that count is specified as an argument prior to the variable 
argument list. 


#include <stdio.h>
#include <stdarg.h>

/* first arg is the number of following int args to sum. */
int sum(int n, ...) {
    int sum = 0;
    va_list it; /* hold information about the variadic argument list. */

    va_start(it, n); /* start variadic argument processing */
    while (n--)
        sum += va_arg(it, int); /* get and sum the next variadic argument */
    va_end(it); /* end variadic argument processing */

    return sum;
}

int main(void)
{
    printf("%d\n", sum(5, 1, 2, 3, 4, 5)); /* prints 15 */
    printf("%d\n", sum(10, 5, 9, 2, 5, 111, 6666, 42, 1, 43, -6218)); /* prints 666 */
    return 0;
}


Section 32.2: Using terminator values to determine the end of
va_list


With any variadic function, the function must know how to interpret the variable arguments list. The “traditional”
approach (exemplified by printf) is to specify the number of arguments up front. However, this is not always a good
idea:


/* First argument specifies the number of parameters; the remainder are also int */
extern int sum(int n, ...);

/* But it's far from obvious from the code. */
sum(5, 2, 1, 4, 3, 6);

/* What happens if, e.g., one argument is removed later on? */
sum(5, 2, 1, 3, 6);  /* Disaster */


Sometimes it's more robust to add an explicit terminator, exemplified by the POSIX execlp() function. Here's
another function to calculate the sum of a series of double numbers:


#include <stdarg.h>
#include <stdio.h>
#include <math.h>

/* Sums args up until the terminator NAN */
double sum(double x, ...) {
    double sum = 0;
    va_list va;

    va_start(va, x);
    for (; !isnan(x); x = va_arg(va, double)) {
        sum += x;
    }
    va_end(va);

    return sum;
}

int main(void) {
    printf("%g\n", sum(5., 2., 1., 4., 3., 6., NAN));
    printf("%g\n", sum(1, 0.5, 0.25, 0.125, 0.0625, 0.03125, NAN));
}


Good terminator values:


• integer (supposed to be all positive or non-negative): 0 or -1
• floating point types: NAN
• pointer types: NULL
• enumerator types: some special value


Section 32.3: Implementing functions with a printf()-like
interface


One common use of variable-length argument lists is to implement functions that are a thin wrapper around the
printf() family of functions. One such example is a set of error reporting functions.


errmsg.h


#ifndef ERRMSG_H_INCLUDED
#define ERRMSG_H_INCLUDED

#include <stdarg.h>
#include <stdnoreturn.h> // C11

void verrmsg(int errnum, const char *fmt, va_list ap);
noreturn void errmsg(int exitcode, int errnum, const char *fmt, ...);
void warnmsg(int errnum, const char *fmt, ...);

#endif


This is a bare-bones example; such packages can be much more elaborate. Normally, programmers will use either
errmsg() or warnmsg(), which themselves use verrmsg() internally. If someone comes up with a need to do more,
though, then the exposed verrmsg() function will be useful. You could avoid exposing it until you have a need for it
(YAGNI: you aren't gonna need it), but the need will arise eventually (you are gonna need it).


errmsg.c 


This code only needs to forward the variadic arguments to the vfprintf() function for outputting to standard 
error. It also reports the system error message corresponding to the system error number (errno) passed to the 
functions. 


#include "errmsg.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void
verrmsg(int errnum, const char *fmt, va_list ap)
{
    if (fmt)
        vfprintf(stderr, fmt, ap);
    if (errnum != 0)
        fprintf(stderr, ": %s", strerror(errnum));
    putc('\n', stderr);
}

void
errmsg(int exitcode, int errnum, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    verrmsg(errnum, fmt, ap);
    va_end(ap);
    exit(exitcode);
}

void
warnmsg(int errnum, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    verrmsg(errnum, fmt, ap);
    va_end(ap);
}


Using errmsg.h


Now you can use those functions as follows:


#include "errmsg.h"
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char buffer[BUFSIZ];
    int fd;
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s filename\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    const char *filename = argv[1];
    if ((fd = open(filename, O_RDONLY)) == -1)
        errmsg(EXIT_FAILURE, errno, "cannot open %s", filename);
    if (read(fd, buffer, sizeof(buffer)) != sizeof(buffer))
        errmsg(EXIT_FAILURE, errno, "cannot read %zu bytes from %s", sizeof(buffer), filename);
    if (close(fd) == -1)
        warnmsg(errno, "cannot close %s", filename);
    /* continue the program */
    return 0;
}


If either the open() or read() system calls fails, the error is written to standard error and the program exits with 
exit code 1. If the close() system call fails, the error is merely printed as a warning message, and the program 
continues. 


Checking the correct use of printf() formats


If you are using GCC (the GNU C Compiler, which is part of the GNU Compiler Collection), or using Clang, then you
can have the compiler check that the arguments you pass to the error message functions match what printf()
expects. Since not all compilers support the extension, it needs to be compiled conditionally, which is a little bit
fiddly. However, the protection it gives is worth the effort.


First, we need to know how to detect that the compiler is GCC or Clang emulating GCC. The answer is that GCC 
defines __GNUC__ to indicate that. 


See common function attributes for information about the attributes — specifically the format attribute. 


Rewritten errmsg.h 


#ifndef ERRMSG_H_INCLUDED
#define ERRMSG_H_INCLUDED

#include <stdarg.h>
#include <stdnoreturn.h>    // C11

#if !defined(PRINTFLIKE)
#if defined(__GNUC__)
#define PRINTFLIKE(n,m) __attribute__((format(printf,n,m)))
#else
#define PRINTFLIKE(n,m) /* If only */
#endif /* __GNUC__ */
#endif /* PRINTFLIKE */

void verrmsg(int errnum, const char *fmt, va_list ap);
void noreturn errmsg(int exitcode, int errnum, const char *fmt, ...)
        PRINTFLIKE(3, 4);
void warnmsg(int errnum, const char *fmt, ...)
        PRINTFLIKE(2, 3);

#endif

Now, if you make a mistake like: 

errmsg(EXIT_FAILURE, errno, "Failed to open file '%d' for reading", filename); 
(where the %d should be %s), then the compiler will complain: 


$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes \
>     -Wold-style-definition -c erruse.c
erruse.c: In function ‘main’:
erruse.c:20:64: error: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘const char *’ [-Werror=format=]
         errmsg(EXIT_FAILURE, errno, "Failed to open file '%d' for reading", filename);
                                                           ~^
                                                           %s
cc1: all warnings being treated as errors
$
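
The same attribute can also be applied to functions that receive a va_list. The following is only a sketch: the helper names vformat_note() and format_note() are hypothetical and not part of errmsg.h. GCC documents that the last attribute argument should be 0 when the arguments are not available for checking (as with vprintf()), so only the format string itself is checked in that case.

```c
#include <stdarg.h>
#include <stdio.h>

/* Same conditional definition as in the rewritten errmsg.h. */
#if !defined(PRINTFLIKE)
#if defined(__GNUC__)
#define PRINTFLIKE(n, m) __attribute__((format(printf, n, m)))
#else
#define PRINTFLIKE(n, m) /* no checking available */
#endif
#endif

/* Hypothetical helper: formats a message into buf.
 * The last attribute argument is 0 because the values arrive as a
 * va_list, so they cannot be checked against the format. */
int vformat_note(char *buf, size_t size, const char *fmt, va_list ap)
        PRINTFLIKE(3, 0);

/* Variadic front end: fmt is parameter 3, the first checked argument is 4. */
int format_note(char *buf, size_t size, const char *fmt, ...)
        PRINTFLIKE(3, 4);

int vformat_note(char *buf, size_t size, const char *fmt, va_list ap)
{
    return vsnprintf(buf, size, fmt, ap);
}

int format_note(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vformat_note(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With this in place, a call such as format_note(buf, sizeof buf, "%d", "text") should be diagnosed by GCC or Clang in the same way as the errmsg() call above.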


Section 32.4: Using a format string 


Using a format string provides information about the expected number and type of the subsequent variadic 
arguments in such a way as to avoid the need for an explicit count argument or a terminator value. 


The example below shows a function that wraps the standard printf() function, only allowing for the use of
variadic arguments of the type char, int and double (in decimal floating point format). Here, like with printf(), the
first argument to the wrapping function is the format string. As the format string is parsed, the function is able to
determine if there is another variadic argument expected and what its type should be.


#include <stdio.h>
#include <stdarg.h>

int simple_printf(const char *format, ...)
{
    va_list ap; /* hold information about the variadic argument list. */
    int printed = 0; /* count of printed characters */

    va_start(ap, format); /* start variadic argument processing */

    while (*format != '\0') /* read format string until string terminator */
    {
        int f = 0;

        if (*format == '%')
        {
            ++format;
            switch(*format)
            {
                case 'c' :
                    f = printf("%c", va_arg(ap, int)); /* print next variadic argument, note type promotion from char to int */
                    break;
                case 'd' :
                    f = printf("%d", va_arg(ap, int)); /* print next variadic argument */
                    break;
                case 'f' :
                    f = printf("%f", va_arg(ap, double)); /* print next variadic argument */
                    break;
                default :
                    f = -1; /* invalid format specifier */
                    break;
            }
        }
        else
        {
            f = printf("%c", *format); /* print any other characters */
        }

        if (f < 0) /* check for errors */
        {
            printed = f;
            break;
        }
        else
        {
            printed += f;
        }
        ++format; /* move on to next character in string */
    }

    va_end(ap); /* end variadic argument processing */

    return printed;
}

int main (int argc, char *argv[])
{
    int x = 40;
    int y = 0;

    y = simple_printf("There are %d characters in this sentence", x);
    simple_printf("\n%d were printed\n", y);
}
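
The same idea works for functions that do not print at all. As a minimal sketch (the function sum_by_format() and its type codes 'i' and 'd' are made up for this illustration), the string below describes one argument per character, so the function needs neither a count argument nor a terminator value:

```c
#include <stdarg.h>

/* Hypothetical helper: adds up its variadic arguments.
 * Each character of `types` describes one argument:
 *   'i' = int, 'd' = double; any other character stops processing. */
double sum_by_format(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, types);
    for (; *types != '\0'; ++types) {
        if (*types == 'i')
            total += va_arg(ap, int);
        else if (*types == 'd')
            total += va_arg(ap, double);
        else
            break; /* unknown type code: stop reading arguments */
    }
    va_end(ap);
    return total;
}
```

For example, sum_by_format("id", 2, 0.5) reads one int and one double. As with printf(), passing arguments that do not match the string is undefined behavior.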
Chapter 33: Assertion


Parameter    Details
expression   expression of scalar type.
message      string literal to be included in the diagnostic message.


An assertion is a predicate that the presented condition must be true at the moment the assertion is encountered 
by the software. Most common are simple assertions, which are validated at execution time. However, static 
assertions are checked at compile time. 


Section 33.1: Simple Assertion 


An assertion is a statement used to assert that a fact must be true when that line of code is reached. Assertions are
useful for ensuring that expected conditions are met. When the condition passed to an assertion is true, there is no
action. The behavior on false conditions depends on whether the macro NDEBUG is defined when <assert.h> is
included. When assertions are enabled (NDEBUG not defined), a false condition writes a diagnostic and halts the
program immediately. When they are disabled, the expression is not evaluated and no action is taken. It is common
practice to enable assertions in internal and debug builds and to disable them in release builds, though some
projects keep them enabled in release as well. (Whether termination is better or worse than continuing with an
error depends on the program.) Assertions should be used only to catch internal programming errors, which usually
means being passed bad parameters.


#include <stdio.h>
/* Uncomment to disable `assert()` */
/* #define NDEBUG */
#include <assert.h>

int main(void)
{
    int x = -1;
    assert(x >= 0);

    printf("x = %d\n", x);
    return 0;
}


Possible output with NDEBUG undefined: 

a.out: main.c:9: main: Assertion `x >= 0' failed. 
Possible output with NDEBUG defined: 

x = -1


It's good practice to define NDEBUG globally, so that you can easily compile your code with all assertions either on or
off. An easy way to do this is to define NDEBUG as an option to the compiler (for example, -DNDEBUG with GCC or
Clang), or to define it in a shared configuration header (e.g. config.h) that is included before <assert.h>.


Section 33.2: Static Assertion 


Version ≥ C11


Static assertions check a condition when the code is compiled, not at run time. If the condition is false, the
compiler is required to issue an error message and stop the compiling process. The first argument, the condition
that is checked, must be a constant expression, and the second must be a string literal.


Unlike assert, _Static_assert is a keyword. A convenience macro static_assert is defined in <assert.h>. 


#include <assert.h> 


enum {N = 5}; 
_Static_assert(N == 5, "N does not equal 5"); 
static_assert(N > 10, "N is not greater than 10"); /* compiler error */ 
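
A typical use is to record an assumption about the platform next to the code that relies on it. The following is a minimal sketch; the function load_le32() is a hypothetical helper invented for this illustration, and both assertions are expected to pass on common platforms:

```c
#include <assert.h>   /* static_assert convenience macro (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>

/* The decoder below assumes 8-bit bytes; refuse to compile otherwise. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* Hypothetical helper: reads a 32-bit little-endian value from 4 bytes. */
uint32_t load_le32(const unsigned char bytes[4])
{
    /* A static assertion may also appear inside a function body. */
    static_assert(sizeof(uint32_t) == 4, "uint32_t must occupy 4 bytes");

    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```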


Version = C99 


Prior to C11, there was no direct support for static assertions. However, in C99, static assertions could be emulated
with macros that would trigger a compilation failure if the compile time condition was false. Unlike _Static_assert,
the second parameter needs to be a proper token name so that a variable name can be created with it. If the
assertion fails, the variable name is seen in the compiler error, since that name is used in an invalid array
declaration (an array of negative size).


#define STATIC_MSG(msg, l) STATIC_MSG2(msg, l)
#define STATIC_MSG2(msg,l) on_line_##l##__##msg
#define STATIC_ASSERT(x, msg) extern char STATIC_MSG(msg, __LINE__) [(x)?1:-1]

enum { N = 5 };
STATIC_ASSERT(N == 5, N_must_equal_5);
STATIC_ASSERT(N > 5, N_must_be_greater_than_5); /* compile error */


Before C99, you could not declare variables at arbitrary locations in a block, so you would have to be extremely 
cautious about using this macro, ensuring that it only appears where a variable declaration would be valid. 


Section 33.3: Assert Error Messages 
A trick exists that can display an error message along with an assertion. Normally, you would write code like this:


void f(void *p)
{
    assert(p != NULL);
    /* more code */
}


If the assertion failed, an error message would resemble

Assertion failed: p != NULL, file main.c, line 5


However, you can use logical AND (&&) to give an error message as well:


void f(void *p)
{
    assert(p != NULL && "function f: p cannot be NULL");
    /* more code */
}


Now, if the assertion fails, an error message will read something like this

Assertion failed: p != NULL && "function f: p cannot be NULL", file main.c, line 5


This works because a string literal always evaluates to non-zero (true): it is converted to a pointer to its first
character, which is never null. Adding && 1 to a Boolean expression has no effect. Thus, adding && "error message"
has no effect either, except that the diagnostic printed by assert() will display the entire expression that failed.
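
If the trick is used in many places, it can be wrapped in a small macro. This is only a sketch: ASSERT_MSG and checked_div() are hypothetical names chosen for the illustration, not standard facilities.

```c
#include <assert.h>

/* Hypothetical convenience macro wrapping the && trick. */
#define ASSERT_MSG(cond, msg) assert((cond) && (msg))

/* Integer division that states its precondition in the diagnostic. */
int checked_div(int numerator, int denominator)
{
    ASSERT_MSG(denominator != 0, "checked_div: denominator must not be zero");
    return numerator / denominator;
}
```

When the assertion fails, the diagnostic contains both the condition and the message text, because assert() prints the whole expression it was given.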


Section 33.4: Assertion of Unreachable Code 


During development, when certain code paths must be prevented from the reach of control flow, you may use
assert(0) to indicate that such a condition is erroneous:


switch (color) {
    case COLOR_RED:
    case COLOR_GREEN:
    case COLOR_BLUE:
        break;

    default:
        assert(0);
}


Whenever the argument of the assert() macro evaluates false, the macro will write diagnostic information to the 
standard error stream and then abort the program. This information includes the file and line number of the 
assert() statement and can be very helpful in debugging. Asserts can be disabled by defining the macro NDEBUG. 


Another way to terminate a program when an error occurs is to call one of the standard library functions exit,
quick_exit or abort. exit and quick_exit take an argument that can be passed back to your environment. abort()
(and thus assert) can be a really severe termination of your program, and certain cleanups that would otherwise be
performed at the end of the execution may not be performed.


The primary advantage of assert() is that it automatically prints debugging information. Calling abort() has the 
advantage that it cannot be disabled like an assert, but it may not cause any debugging information to be displayed. 
In some situations, using both constructs together may be beneficial: 


if (color == COLOR_RED || color == COLOR_GREEN) {
} else if (color == COLOR_BLUE) {
} else {
    assert(0), abort();
}


When asserts are enabled, the assert() call will print debug information and terminate the program. Execution
never reaches the abort() call. When asserts are disabled, the assert() call does nothing and abort() is called.
This ensures that the program always terminates for this error condition; enabling and disabling asserts only affects
whether or not debug output is printed.
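
Put together in a complete function, the pattern might look as follows. This is a sketch only; the enumeration and the function color_name() are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* Hypothetical helper mapping each enumerator to a printable name. */
const char *color_name(enum color c)
{
    switch (c) {
    case COLOR_RED:   return "red";
    case COLOR_GREEN: return "green";
    case COLOR_BLUE:  return "blue";
    }
    /* Only reached if c holds a value outside the enumeration. */
    assert(0 && "color_name: invalid color");
    abort(); /* still terminates when NDEBUG disables the assert */
}
```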


You should never leave such an assert in production code, because the debug information is not helpful for end
users and because abort is generally much too severe a termination: it prevents the cleanup handlers that are
installed for exit or quick_exit from running.


Section 33.5: Precondition and Postcondition 


One use case for assertions is checking preconditions and postconditions. This can be very useful for maintaining
invariants and for design by contract. For example, a length is always zero or positive, so this function must return
a zero or positive value.


#include <stdio.h>
/* Uncomment to disable `assert()` */
/* #define NDEBUG */
#include <assert.h>

int length2 (int *a, int count)
{
    int i, result = 0;

    /* Precondition: */
    /* NULL is an invalid vector */
    assert (a != NULL);
    /* Number of dimensions can not be negative. */
    assert (count >= 0);

    /* Calculation */
    for (i = 0; i < count; ++i)
    {
        result = result + (a[i] * a[i]);
    }

    /* Postcondition: */
    /* Resulting length can not be negative. */
    assert (result >= 0);
    return result;
}

#define COUNT 3

int main (void)
{
    int a[COUNT] = {1, 2, 3};
    int *b = NULL;
    int r;
    r = length2 (a, COUNT);
    printf ("r = %i\n", r);
    r = length2 (b, COUNT);
    printf ("r = %i\n", r);
    return 0;
}

Chapter 34: Generic selection


Parameter             Details
generic-assoc-list    generic-association OR generic-assoc-list , generic-association
generic-association   type-name : assignment-expression OR default : assignment-expression


Section 34.1: Check whether a variable is of a certain qualified type


#include <stdio.h>

#define is_const_int(x) _Generic((&x),  \
        const int *: "a const int",     \
        int *:       "a non-const int", \
        default:     "of other type")

int main(void)
{
    const int i = 1;
    int j = 1;
    double k = 1.0;
    printf("i is %s\n", is_const_int(i));
    printf("j is %s\n", is_const_int(j));
    printf("k is %s\n", is_const_int(k));
}

Output: 


i is a const int 
j is a non-const int 
k is of other type 


However, if the type generic macro is implemented like this: 


#define is_const_int(x) _Generic((x),   \
        const int: "a const int",       \
        int:       "a non-const int",   \
        default:   "of other type")


The output is: 


i is a non-const int 
j is a non-const int 
k is of other type 


This is because all type qualifiers are dropped for the evaluation of the controlling expression of a _Generic
primary expression.


Section 34.2: Generic selection based on multiple arguments 


If a selection on multiple arguments for a type generic expression is wanted, and all types in question are 
arithmetic types, an easy way to avoid nested _Generic expressions is to use addition of the parameters in the 
controlling expression:

int max_int(int, int);
unsigned max_unsigned(unsigned, unsigned);
double max_double(double, double);

#define MAX(X, Y) _Generic((X)+(Y),                \
                           int:      max_int,      \
                           unsigned: max_unsigned, \
                           default:  max_double)   \
                  ((X), (Y))


Here, the controlling expression (X)+(Y) is only inspected according to its type and not evaluated. The usual
arithmetic conversions are applied to the operands to determine the type that is selected.
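
To try the macro out, the three functions need definitions. The bodies below are an assumption made for this sketch, since only the declarations are given above:

```c
/* Assumed implementations of the three declared functions. */
int max_int(int a, int b) { return a > b ? a : b; }
unsigned max_unsigned(unsigned a, unsigned b) { return a > b ? a : b; }
double max_double(double a, double b) { return a > b ? a : b; }

/* The type of (X)+(Y), after the usual arithmetic conversions,
 * selects the function; the sum itself is never computed. */
#define MAX(X, Y) _Generic((X)+(Y),                \
                           int:      max_int,      \
                           unsigned: max_unsigned, \
                           default:  max_double)   \
                  ((X), (Y))
```

With these definitions, MAX(3, 7) calls max_int, MAX(2u, 5u) calls max_unsigned and MAX(1, 2.5) calls max_double. Note that mixing int and unsigned selects max_unsigned, so a negative int argument is converted to a large unsigned value, and that wider integer types such as long fall through to max_double.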


For more complex situations, a selection can be made based on more than one argument by nesting _Generic
expressions.


This example selects between four externally implemented functions, that take combinations of two int and/or 
string arguments, and return their sum. 


int AddIntInt(int a, int b);
int AddIntStr(int a, const char* b);
int AddStrInt(const char* a, int b);
int AddStrStr(const char* a, const char* b);

#define AddStr(y)                            \
    _Generic((y),         int: AddStrInt,    \
                        char*: AddStrStr,    \
                  const char*: AddStrStr )

#define AddInt(y)                            \
    _Generic((y),         int: AddIntInt,    \
                        char*: AddIntStr,    \
                  const char*: AddIntStr )

#define Add(x, y)                            \
    _Generic((x) ,        int: AddInt(y) ,   \
                        char*: AddStr(y) ,   \
                  const char*: AddStr(y) )   \
                          ((x), (y))

int main( void )
{
    int result = 0;
    result = Add( 100 , 999 );
    result = Add( 100 , "999" );
    result = Add( "100" , 999 );
    result = Add( "100" , "999" );

    const int a = -123;
    char b[] = "4321";
    result = Add( a , b );

    int c = 1;
    const char d[] = "0";
    result = Add( d , ++c );
}


Even though it appears as if argument y is evaluated more than once, it isn't [1]. Both arguments are evaluated only
once, at the end of macro Add: ( x , y ), just like in an ordinary function call.

[1] (Quoted from: ISO/IEC 9899:201X 6.5.1.1 Generic selection, paragraph 3)
The controlling expression of a generic selection is not evaluated.


Section 34.3: Type-generic printing macro 


#include <stdio.h>

void print_int(int x) { printf("int: %d\n", x); }
void print_dbl(double x) { printf("double: %g\n", x); }
void print_default() { puts("unknown argument"); }

#define print(X) _Generic((X), \
        int: print_int, \
        double: print_dbl, \
        default: print_default)(X)

int main(void) {
    print(42);
    print(3.14);
    print("hello, world");
}

Output: 


int: 42 
double: 3.14 
unknown argument 


Note that if the type is neither int nor double, a warning would be generated. To eliminate the warning, you can
add that type to the print(X) macro.
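
As a sketch of that suggestion, the variant below adds string support. It differs from the example above in two ways that are choices made for this illustration: the helper print_str() is new, and all helpers return the value of printf() (the number of characters written) so that the result can be checked.

```c
#include <stdio.h>

/* Each helper returns what printf returns: the number of characters written. */
int print_int(int x)         { return printf("int: %d\n", x); }
int print_dbl(double x)      { return printf("double: %g\n", x); }
int print_str(const char *s) { return printf("string: %s\n", s); }

/* No default association: an unsupported type is now a compile-time error. */
#define print(X) _Generic((X),      \
        int:          print_int,    \
        double:       print_dbl,    \
        char *:       print_str,    \
        const char *: print_str)(X)
```

A string literal such as "hello, world" has array type, but current GCC and Clang convert it to char * before the selection is made, which is why both char * and const char * are listed.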
Chapter 35: X-macros


X-macros are a preprocessor-based technique for minimizing repetitious code and maintaining data / code 
correspondences. Multiple distinct macro expansions based on a common set of data are supported by 
representing the whole group of expansions via a single master macro, with that macro's replacement text 
consisting of a sequence of expansions of an inner macro, one for each datum. The inner macro is traditionally 
named X(), hence the name of the technique. 


Section 35.1: Trivial use of X-macros for printfs 


/* define a list of preprocessor tokens on which to call X */
#define X_123 X(1) X(2) X(3)

/* define X to use */
#define X(val) printf("X(%d) made this print\n", val);
X_123
#undef X
/* good practice to undef X to facilitate reuse later on */


This example will result in the preprocessor generating the following code: 


printf("X(%d) made this print\n", 1); 
printf("X(%d) made this print\n", 2); 
printf("X(%d) made this print\n", 3); 


Section 35.2: Extension: Give the X macro as an argument 


The X-macro approach can be generalized a bit by making the name of the "X" macro an argument of the master 
macro. This has the advantages of helping to avoid macro name collisions and of allowing use of a general-purpose 
macro as the "X" macro. 


As always with X macros, the master macro represents a list of items whose significance is specific to that macro. In 
this variation, such a macro might be defined like so: 


/* declare list of items */ 
#define ITEM_LIST(X) \ 
X(item1) \ 
X(item2) \ 
X(item3) \ 
/* end of list */ 


One might then generate code to print the item names like so: 


/* define macro to apply */
#define PRINTSTRING(value) printf( #value "\n"); 


/* apply macro to the list of items */ 
ITEM_LIST(PRINTSTRING) 


That expands to this code:

printf( "item1" "\n"); printf( "item2" "\n"); printf( "item3" "\n");


In contrast to standard X macros, where the "X" name is a built-in characteristic of the master macro, with this style
it may be unnecessary or even undesirable to afterward undefine the macro used as the argument (PRINTSTRING in
this example).
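
Because the macro to apply is an argument, the same list can be expanded several times with different macros. The following sketch counts the items and builds a name table from one list; COUNT_ITEM, ITEM_STRING, ITEM_COUNT, item_names and item_name() are names invented for this illustration.

```c
#include <stddef.h>

#define ITEM_LIST(X) \
    X(item1)         \
    X(item2)         \
    X(item3)

/* Expands to "+1" per item, so ITEM_COUNT becomes 0 +1 +1 +1. */
#define COUNT_ITEM(name) +1
enum { ITEM_COUNT = 0 ITEM_LIST(COUNT_ITEM) };

/* Expands to one string literal per item. */
#define ITEM_STRING(name) #name,
static const char *const item_names[ITEM_COUNT] = { ITEM_LIST(ITEM_STRING) };

/* Returns the name for an index, or NULL if the index is out of range. */
const char *item_name(int index)
{
    if (index < 0 || index >= ITEM_COUNT)
        return NULL;
    return item_names[index];
}
```

Adding an item to ITEM_LIST updates both the count and the table, which is the data / code correspondence the technique is meant to maintain.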


Section 35.3: Enum Value and Identifier 


/* declare items of the enum */
#define FOREACH \
      X(item1) \
      X(item2) \
      X(item3) \
/* end of list */

/* define the enum values */
#define X(id) MyEnum_ ## id,
enum MyEnum { FOREACH };
#undef X

/* convert an enum value to its identifier */
const char * enum2string(int enumValue)
{
    const char* stringValue = NULL;
#define X(id) if (enumValue == MyEnum_ ## id) stringValue = #id;
    FOREACH
#undef X
    return stringValue;
}

Next you can use the enumerated value in your code and easily print its identifier using:


printf("%s\n", enum2string(MyEnum_item2));


Section 35.4: Code generation 


X-macros can be used for code generation: instead of writing repetitive code by hand, you iterate over a list to
perform some tasks, or to declare a set of constants, objects or functions.


Here we use X-macros to declare an enum containing 4 commands and a map of their names as strings. Then we can
print the string values of the enum.


/* All our commands */
#define COMMANDS(OP) OP(Open) OP(Close) OP(Save) OP(Quit)

/* generate the enum Commands: {cmdOpen, cmdClose, cmdSave, cmdQuit, }; */
#define ENUM_NAME(name) cmd##name,
enum Commands {
    COMMANDS(ENUM_NAME)
};
#undef ENUM_NAME

/* generate the string table */
#define COMMAND_OP(name) #name,
const char* const commandNames[] = {
    COMMANDS(COMMAND_OP)
};
#undef COMMAND_OP

/* the following prints "Quit\n": */
printf("%s\n", commandNames[cmdQuit]);


Similarly, we can generate a jump table to call functions by the enum value. 


This requires all functions to have the same signature. If they take no arguments and return an int, we would put
this in a header with the enum definition:


/* declare all functions as extern */
#define EXTERN_FUNC(name) extern int doCmd##name(void);
COMMANDS(EXTERN_FUNC)
#undef EXTERN_FUNC

/* declare the function pointer type and the jump table */
typedef int (*CommandFunc)(void);
extern CommandFunc commandJumpTable[];


All of the following can be in different compilation units assuming the part above is included as a header: 


/* generate the jump table */ 
#define FUNC_NAME(name) doCmd##name, 
CommandFunc commandJumpTable[] = { 
COMMANDS (FUNC_NAME) 


}; 
#undef FUNC_NAME 


/* call the save command like this: */ 
int result = commandJumpTable[cmdSave](); 


/* somewhere else, we need the implementations of the commands */ 
int doCmdOpen(void) {/* code performing open command */} 

int doCmdClose(void) {/* code performing close command */} 

int doCmdSave(void) {/* code performing save command */} 

int doCmdQuit(void) {/* code performing quit command */} 


An example of this technique being used in real code is for GPU command dispatching in Chromium.

Chapter 36: Aliasing and effective type
Section 36.1: Effective type 


The effective type of a data object is the last type information that was associated with it, if any. 


// a normal variable, effective type uint32_t, and this type never changes
uint32_t a = 0.0;

// effective type of *pa is uint32_t, too, simply
// because *pa is the object a
uint32_t* pa = &a;

// the object pointed to by q has no effective type, yet
void* q = malloc(sizeof(uint32_t));
// the object pointed to by q still has no effective type,
// because nobody has written to it
uint32_t* qb = q;
// *qb now has effective type uint32_t because a uint32_t value was written
*qb = 37;

// the object pointed to by r has no effective type, yet, although
// it is initialized
void* r = calloc(1, sizeof(uint32_t));
// the object pointed to by r still has no effective type,
// because nobody has written to or read from it
uint32_t* rc = r;
// *rc now has effective type uint32_t because a value is read
// from it with that type. The read operation is valid because we used calloc.
// Now the object pointed to by r (which is the same as *rc) has
// gained an effective type, although we didn't change its value.
uint32_t c = *rc;

// the object pointed to by s has no effective type, yet.
void* s = malloc(sizeof(uint32_t));
// the object pointed to by s now has effective type uint32_t
// because an uint32_t value is copied into it.
memcpy(s, r, sizeof(uint32_t));


Observe that for the latter, it was not necessary that we even have an uint32_t* pointer to that object. The fact 
that we have copied another uint32_t object is sufficient. 
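
The memcpy case can be sketched as a complete function. The name roundtrip_u32() is hypothetical, and returning 0 when the allocation fails is merely a simplification for the example:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copies value into freshly allocated storage and reads it back.
 * Returns 0 if the allocation fails (a simplification for this sketch). */
uint32_t roundtrip_u32(uint32_t value)
{
    uint32_t result = 0;
    void *storage = malloc(sizeof value);   /* allocated: no effective type yet */

    if (storage != NULL) {
        /* copying a uint32_t object gives the storage that effective type */
        memcpy(storage, &value, sizeof value);
        const uint32_t *view = storage;     /* first typed pointer to the object */
        result = *view;                     /* read with the matching type */
        free(storage);
    }
    return result;
}
```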


Section 36.2: restrict qualification 


If we have two pointer arguments of the same type, the compiler can't make any assumption and will always have
to assume that a change to *e may change *f:


void fun(float* e, float* f) {
    float a = *f;
    *e = 22;
    float b = *f;
    printf("is %g equal to %g?\n", a, b);
}

float fval = 4;
float eval = 77;
fun(&eval, &fval);

With this call, all goes well and something like

is 4 equal to 4?

is printed. If we pass the same pointer, the program will still do the right thing and print

is 4 equal to 22?


This can turn out to be inefficient, if we know by some outside information that e and f will never point to the same 
data object. We can reflect that knowledge by adding restrict qualifiers to the pointer parameters: 


void fun(float* restrict e, float* restrict f) {
    float a = *f;
    *e = 22;
    float b = *f;
    printf("is %g equal to %g?\n", a, b);
}


Then the compiler may always suppose that e and f point to different objects.
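
A typical place for restrict is a loop over two arrays. In the sketch below (scale() and scale_demo() are illustrative names), the qualifiers promise that the output and input arrays do not overlap, so the compiler does not have to reload in[i] after each store to out[i]:

```c
#include <stddef.h>

/* out[i] = k * in[i]; restrict promises that out and in do not overlap. */
void scale(size_t n, float *restrict out, const float *restrict in, float k)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = k * in[i];
}

/* Calls scale() with two distinct arrays, so the promise holds. */
float scale_demo(void)
{
    const float in[3] = { 1.0f, 2.0f, 3.5f };
    float out[3];

    scale(3, out, in, 2.0f);
    return out[2];
}
```

The promise is the caller's responsibility: calling scale() with overlapping arrays would have undefined behavior.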


Section 36.3: Changing bytes 


Once an object has an effective type, you should not attempt to modify it through a pointer of another type, unless 
that other type is a character type, char, signed char or unsigned char. 


#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint32_t a = 57;
    // conversion from incompatible types needs a cast !
    unsigned char* ap = (unsigned char*)&a;
    for (size_t i = 0; i < sizeof a; ++i) {
        /* set each byte of a to 42 */
        ap[i] = 42;
    }
    printf("a now has value %" PRIu32 "\n", a);
}

This is a valid program that prints 
a now has value 707406378 


This works because: 


• The access is made to the individual bytes seen with type unsigned char so each modification is well
  defined.

• The two views to the object, through a and through *ap, alias, but since ap is a pointer to a character type, the
  strict aliasing rule does not apply. Thus the compiler has to assume that the value of a may have been
  changed in the for loop. The modified value of a must be constructed from the bytes that have been
  changed.

• The type of a, uint32_t, has no padding bits. All bits of its representation count for the value, here
  707406378, and there can be no trap representation.
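
The printed value follows directly from the bytes: 42 is 0x2A, so the four bytes form 0x2A2A2A2A, which is 707406378 in decimal. Since all bytes are equal, the result does not depend on byte order. A small sketch of the same technique as a function (fill_bytes() is a hypothetical name):

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrites every byte of a uint32_t with the same value and returns it. */
uint32_t fill_bytes(uint32_t value, unsigned char byte)
{
    /* access through unsigned char is permitted for any object */
    unsigned char *p = (unsigned char *)&value;

    for (size_t i = 0; i < sizeof value; ++i)
        p[i] = byte;
    return value;
}
```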


Section 36.4: Character types cannot be accessed through non-character types


If an object is defined with static, thread, or automatic storage duration, and it has a character type, either: char, 
unsigned char, or signed char, it may not be accessed by a non-character type. In the below example a char array 
is reinterpreted as the type int, and the behavior is undefined on every dereference of the int pointer b. 


int main( void )
{
    char a[100];
    int* b = ( int* )&a;
    *b = 1;

    static char c[100];
    b = ( int* )&c;
    *b = 2;

    static _Thread_local char d[100];
    b = ( int* )&d;
    *b = 3;
}

This is undefined because it violates the "effective type" rule: no data object that has an effective type may be
accessed through another type that is not a character type. Since the other type here is int, this is not allowed.


Even if alignment and pointer sizes are known to fit, that does not exempt the code from this rule; the behavior is
still undefined.


This means in particular that there is no way in standard C to reserve a buffer object of character type that can be
used through pointers of different types, as you would use a buffer returned by malloc or a similar function.


A correct way to achieve the same goal as in the above example would be to use a union. 


typedef union bufType bufType;
union bufType {
    char c[sizeof(int[25])];
    int i[25];
};

int main( void )
{
    bufType a = { .c = { 0 } };   // reserve a buffer and initialize
    int* b = a.i;                 // no cast necessary
    *b = 1;

    static bufType s = { .c = { 0 } };
    b = s.i;
    *b = 2;

    static _Thread_local bufType t = { .c = { 0 } };
    b = t.i;
    *b = 3;
}


Here, the union ensures that the compiler knows from the start that the buffer could be accessed through different
views. This also has the advantage that now the buffer has a "view" a.i that already is of type int and no pointer
conversion is needed.


Section 36.5: Violating the strict aliasing rules 
In the following code let us assume for simplicity that float and uint32_t have the same size. 


void fun(uint32_t* u, float* f) {
    float a = *f;
    *u = 22;
    float b = *f;
    printf("%g should equal %g\n", a, b);
}


u and f have different base types, and thus the compiler can assume that they point to different objects. There is no
possibility that *f could have changed between the two initializations of a and b, and so the compiler may optimize
the code to something equivalent to


void fun(uint32_t* u, float* f) {
    float a = *f;
    *u = 22;
    printf("%g should equal %g\n", a, a);
}

That is, the second load operation of *f can be optimized out completely. 


If we call this function "normally" 


float fval = 4;
uint32_t uval = 77;
fun(&uval, &fval);


all goes well and something like 
4 should equal 4 


is printed. But if we cheat and pass the same pointer, after converting it, 


float fval = 4;
uint32_t* up = (uint32_t*)&fval;
fun(up, &fval);


we violate the strict aliasing rule. Then the behavior becomes undefined. The output could be as above, if the
compiler had optimized the second access, or something completely different, and so your program ends up in a
completely unreliable state.

Chapter 37: Compilation


The C language is traditionally a compiled language (as opposed to interpreted). The C Standard defines translation
phases, and the product of applying them is a program image (or compiled program). In C11, the phases are listed
in §5.1.1.2.


Section 37.1: The Compiler 


After the C pre-processor has included all the header files and expanded all macros, the compiler can compile the
program. It does this by turning the C source code into an object code file, which is a file ending in .o that
contains the binary version of the source code. Object code is not directly executable, though. In order to make an
executable, you also have to add code for all of the library functions that were #included into the file (this is not the
same as including the declarations, which is what #include does). This is the job of the linker.


In general, exactly how to invoke a C compiler depends very much on the system that you are using. Here
we are using the GCC compiler, though it should be noted that many more compilers exist:


% gcc -Wall -c foo.c 


% is the OS' command prompt. This tells the compiler to run the pre-processor on the file foo.c and then compile it 
into the object code file foo.o. The -c option means to compile the source code file into an object file but not to 
invoke the linker. This option -c is available on POSIX systems, such as Linux or macOS; other systems may use 
different syntax. 


If your entire program is in one source code file, you can instead do this: 
% gcc -Wall foo.c -o foo 


This tells the compiler to run the pre-processor on foo.c, compile it and then link it to create an executable called
foo. The -o option states that the next word on the line is the name of the binary executable file (program). If you
don't specify the -o (if you just type gcc foo.c), the executable will be named a.out for historical reasons.


In general the compiler takes four steps when converting a .c file into an executable: 


1. pre-processing - textually expands #include directives and #define macros in your .c file 

2. compilation - converts the program into assembly (you can stop the compiler at this step by adding the -S 
option) 

3. assembly - converts the assembly into machine code 

4. linkage - links the object code to external libraries to create an executable 


Note also that the name of the compiler we are using is GCC, which stands for both "GNU C compiler" and "GNU 
compiler collection", depending on context. Other C compilers exist. For Unix-like operating systems, many of them 
have the name cc, for "C compiler", which is often a symbolic link to some other compiler. On Linux systems, cc is 
often an alias for GCC. On macOS or OS-X, it points to clang. 


The POSIX standard currently mandates c99 as the name of a C compiler; it supports the C99 standard by
default. Earlier versions of POSIX mandated c89 as the compiler. POSIX also mandates that this compiler
understands the options -c and -o that we used above.


Note: The -Wall option present in both gcc examples tells the compiler to print warnings about questionable
constructions, which is strongly recommended. It is also a good idea to add other warning options, e.g. -Wextra.

Section 37.2: File Types
Compiling C programs requires you to work with five kinds of files: 


1. Source files: These files contain function definitions, and have names which end in .c by convention. Note: 
.cc and .cpp are C++ files; not C files. 
e.g., foo.c


2. Header files: These files contain function prototypes and various pre-processor statements (see below). 
They are used to allow source code files to access externally-defined functions. Header files end in .h by 
convention. 

e.g., foo.h 


3. Object files: These files are produced as the output of the compiler. They consist of function definitions in 
binary form, but they are not executable by themselves. Object files end in .o by convention, although on 
some operating systems (e.g. Windows, MS-DOS), they often end in .obj. 

e.g., foo.o foo.obj 


4. Binary executables: These are produced as the output of a program called a "linker". The linker links 
together a number of object files to produce a binary file which can be directly executed. Binary executables 
have no special suffix on Unix operating systems, although they generally end in .exe on Windows. 


e.g., foo foo.exe 


5. Libraries: A library is a compiled binary but is not in itself an executable (i.e., there is no main() function
in a library). A library contains functions that may be used by more than one program. A library should ship
with header files which contain prototypes for all functions in the library; these header files should be
referenced (e.g., #include <library.h>) in any source file that uses the library. The linker then needs to be
referred to the library so the program can be successfully built. There are two types of libraries: static and
dynamic.


o Static library: A static library (.a files on POSIX systems and .lib files on Windows, not to be
confused with DLL import library files, which also use the .lib extension) is statically built into the
program. Static libraries have the advantage that the program knows exactly which version of a library
is used. On the other hand, executables are larger because all the library functions they use are
included.

e.g., libfoo.a foo.lib

o Dynamic library: A dynamic library (.so files on most POSIX systems, .dylib on OS X and .dll files
on Windows) is dynamically linked at runtime by the program. These are also sometimes referred to
as shared libraries because one library image can be shared by many programs. Dynamic libraries
have the advantage of taking up less disk space if more than one application is using the library. Also,
they allow library updates (bug fixes) without having to rebuild executables.

e.g., foo.so foo.dylib foo.dll



Section 37.3: The Linker 


The job of the linker is to link together a bunch of object files (.o files) into a binary executable. The process of
linking mainly involves resolving symbolic addresses to numerical addresses. The result of the link process is normally 
an executable program. 


During the link process, the linker will pick up all the object modules specified on the command line, add some 
system-specific startup code in front and try to resolve all external references in the object module with external 
definitions in other object files (object files can be specified directly on the command line or may implicitly be added 
through libraries). It will then assign load addresses for the object files, that is, it specifies where the code and data
will end up in the address space of the finished program. Once it's got the load addresses, it can replace all the
symbolic addresses in the object code with "real", numerical addresses in the target's address space. The program 
is ready to be executed now. 


This includes both the object files that the compiler created from your source code files as well as object files that 
have been pre-compiled for you and collected into library files. These files have names which end in .a or .so, and 
you normally don't need to know about them, as the linker knows where most of them are located and will link 
them in automatically as needed. 


Implicit invocation of the linker 


Like the pre-processor, the linker is a separate program, often called ld (GCC typically invokes it through a helper called collect2, for example).
Also like the pre-processor, the linker is invoked automatically for you when you use the compiler. Thus, the normal 
way of using the linker is as follows: 


% gcc foo.o bar.o baz.o -o myprog 


This line tells the compiler to link together three object files (foo.o, bar.o, and baz.o) into a binary executable file
named myprog. Now you have a file called myprog that you can run and which will hopefully do something cool 
and/or useful. 


Explicit invocation of the linker 


It is possible to invoke the linker directly, but this is seldom advisable, and is typically very platform-specific. That is, 
options that work on Linux won't necessarily work on Solaris, AIX, macOS, Windows, and similarly for any other 
platform. If you work with GCC, you can use gcc -v to see what is executed on your behalf. 


Options for the linker 


The linker also takes some arguments to modify its behavior. The following command would tell gcc to link foo.o
and bar.o, but also include the ncurses library.


% gcc foo.o bar.o -o foo -lncurses


This is actually (more or less) equivalent to


% gcc foo.o bar.o /usr/lib/libncurses.so -o foo


(although libncurses.so could be libncurses.a, which is just an archive created with ar). Note that you should list 
the libraries (either by pathname or via -lname options) after the object files. With static libraries, the order that
they are specified matters; often, with shared libraries, the order doesn't matter. 


Note that on many systems, if you are using mathematical functions (from <math.h>), you need to specify -lm to
load the mathematics library, but Mac OS X and macOS Sierra do not require this. There are other libraries that
are separate libraries on Linux and other Unix systems, but not on macOS: POSIX threads, POSIX realtime,
and networking libraries are examples. Consequently, the linking process varies between platforms.


Other compilation options 


This is all you need to know to begin compiling your own C programs. Generally, we also recommend that you use 
the -Wall command-line option: 


% gcc -Wall -c foo.c


The -Wall option causes the compiler to warn you about legal but dubious code constructs, and will help you catch
a lot of bugs very early.


If you want the compiler to throw more warnings at you (including variables that are declared but not used, 
forgetting to return a value etc.), you can use this set of options, as -Wall, despite the name, doesn't turn all of the
possible warnings on: 


% gcc -Wall -Wextra -Wfloat-equal -Wundef -Wcast-align -Wwrite-strings -Wlogical-op \ 
> -Wmissing-declarations -Wredundant-decls -Wshadow ... 


Note that clang has an option -Weverything which really does turn on all warnings in clang. 


Section 37.4: The Preprocessor 


Before the C compiler starts compiling a source code file, the file is processed in a preprocessing phase. This phase 
can be done by a separate program or be completely integrated in one executable. In any case, it is invoked 
automatically by the compiler before compilation proper begins. The preprocessing phase converts your source 
code into another source code or translation unit by applying textual replacements. You can think of it as a 
"modified" or "expanded" source code. That expanded source may exist as a real file in the file system, or it may 
only be stored in memory for a short time before being processed further. 


Preprocessor commands start with the pound sign ("#"). There are several preprocessor commands; the most
important things the preprocessor handles are:


1. Defines: 


#define is mainly used to define constants. For instance, 


#define BIGNUM 1000000 
int a = BIGNUM; 


becomes 


int a = 1000000 ; 


#define is used in this way so as to avoid having to explicitly write out some constant value in many different 
places in a source code file. This is important in case you need to change the constant value later on; it's 
much less bug-prone to change it once, in the #define, than to have to change it in multiple places scattered 
all over the code. 


Because #define just does advanced search and replace, you can also declare macros. For instance: 


#define ISTRUE(stm) do{stm = stm ? 1 : 0;}while(0)
// in the function:

a = x;

ISTRUE(a);


becomes:


// in the function:


a = x;
do {
    a = a ? 1 : 0;
} while(0);


At first approximation, this effect is roughly the same as with inline functions, but the preprocessor doesn't
provide type checking for #define macros. This is well known to be error-prone and their use necessitates
great caution.


Also note that the preprocessor replaces comments with blanks, as explained below.


2. Includes:


#include is used to access functions defined outside of a source code file. For instance:


#include <stdio.h> 


causes the preprocessor to paste the contents of <stdio.h> into the source code file at the location of the 
#include statement before it gets compiled. #include is almost always used to include header files, which 
are files which mainly contain function declarations and #define statements. In this case, we use #include in 
order to be able to use functions such as printf and scanf, whose declarations are located in the file 
stdio.h. C compilers do not allow you to use a function unless it has previously been declared or defined in 
that file; #include statements are thus the way to re-use previously-written code in your C programs. 


3. Logic operations: 


#if defined A || defined B 
variable = another_variable + 1; 
#else 

variable = another_variable * 2; 
#endif 


will be changed to: 


variable = another_variable + 1; 


if A or B was defined earlier in the translation unit (or on the compiler command line). If this is not the case, the
preprocessor will instead produce:


variable = another_variable * 2;


This is often used for code that runs on different systems or compiles with different compilers. Since there are
predefined macros that are specific to a compiler or system, you can test for them and let the compiler
see only the code that is valid for that platform.


4. Comments 


The preprocessor replaces all comments in the source file by single spaces. Comments are indicated by // up
to the end of the line, or a combination of opening /* and closing */ comment brackets.
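

The four preprocessor features described above can be seen together in a short sketch. All names here (BUFFER_SIZE, SQUARE, USE_LARGE_STEP, STEP and the three functions) are made up for illustration and do not come from any library:

```c
#include <stddef.h>

/* Object-like macro: a named constant. */
#define BUFFER_SIZE 64

/* Function-like macro: arguments are pasted in as text, so every use of
   an argument is parenthesized to keep operator precedence intact. */
#define SQUARE(x) ((x) * (x))

/* Conditional compilation: the hypothetical macro USE_LARGE_STEP could be
   defined on the command line, for example with -DUSE_LARGE_STEP. */
#if defined USE_LARGE_STEP
#define STEP 10
#else
#define STEP 1
#endif

/* After preprocessing, the body reads: return ((n + 1) * (n + 1)); */
int square_of_next(int n)
{
    return SQUARE(n + 1);
}

/* The comment between the two words is replaced by one space, so
   "int" and "step" remain separate tokens. */
int/* comment */step(void)
{
    return STEP;
}

size_t buffer_size(void)
{
    return BUFFER_SIZE;
}
```

With gcc or clang, running the compiler with the -E option prints the preprocessed source, which is a convenient way to check what a macro really expands to.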


Section 37.5: The Translation Phases 


As of the C 2011 Standard, §5.1.1.2 Translation Phases, the translation of source code to a program image
(e.g., the executable) is specified to occur in 8 ordered steps.


1. The source file input is mapped to the source character set (if necessary). Trigraphs are replaced in this step.

2. Continuation lines (lines that end with \) are spliced with the next line.

3. The source code is parsed into whitespace and preprocessing tokens.

4. The preprocessor is applied, which executes directives, expands macros, and applies pragmas. Each source
file pulled in by #include undergoes translation phases 1 through 4 (recursively if necessary). All
preprocessor related directives are then deleted.

5. Source character set values in character constants and string literals are mapped to the execution character
set.

6. String literals adjacent to each other are concatenated.

7. The source code is parsed into tokens, which comprise the translation unit.

8. External references are resolved, and the program image is formed.


An implementation of a C compiler may combine several steps together, but the resulting image must still behave 
as if the above steps had occurred separately in the order listed above. 
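

Two of these phases are easy to observe from ordinary code. The following is a minimal sketch (the macro ADD and the function names are invented for this example) showing line splicing from phase 2 and string literal concatenation from phase 6:

```c
#include <string.h>

/* Phase 2: the backslash-newline pair is deleted, so this macro
   definition is one logical line by the time it is tokenized. */
#define ADD(a, b) \
    ((a) + (b))

/* Phase 6: adjacent string literals are concatenated into one. */
static const char greeting[] = "Hello, " "world";

int add_three_and_four(void)
{
    return ADD(3, 4);
}

size_t greeting_length(void)
{
    return strlen(greeting);   /* length of "Hello, world" */
}
```

Because the concatenation happens after macro expansion, it also works when one of the literals comes from a macro, which is how the format macros in <inttypes.h> are used.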


Goalkicker.com - C Notes for Professionals 226 


Chapter 38: Inline assembly 


Section 38.1: gcc Inline assembly in macros 


We can put assembly instructions inside a macro and use the macro like you would call a function. 


#define mov(x,y) \
{ \
    __asm__ ("l.cmov %0,%1,%2" : "=r" (x) : "r" (y), "r" (0x0000000F)); \
}


/// some definition and assignment
unsigned char sbox[size][size];
unsigned char state[size][size];


///Using
mov(state[0][1], sbox[si][sj]);


Using inline assembly instructions embedded in C code can improve the run time of a program. This is very helpful
in time critical situations like cryptographic algorithms such as AES. For example, for a simple shift operation that is
needed in the AES algorithm, we can replace the C shift operator >> with a direct Rotate Right assembly instruction.


In an implementation of 'AES256', in 'AddRoundKey()' function we have some statements like this: 


unsigned int w; // 32-bit
unsigned char subkey[4]; // 8-bit, 4*8 = 32


subkey[0] = w >> 24; // hold 8 bit, MSB, leftmost group of 8-bits
subkey[1] = w >> 16; // hold 8 bit, second group of 8-bit from left
subkey[2] = w >> 8; // hold 8 bit, second group of 8-bit from right
subkey[3] = w; // hold 8 bit, LSB, rightmost group of 8-bits


/// subkey <- w

They simply assign the bytes of w to the subkey array.

We can replace the three shift-and-assign statements and the one plain assignment with a single assembly Rotate Right operation:

__asm__ ("l.ror %0,%1,%2" : "=r" (* (unsigned int *) subkey) : "r" (w), "r" (0x10));


The final result is exactly the same.
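

For comparison, the shift-based version can be wrapped in a small portable function, which is useful as a reference when checking an assembly replacement. This is only a sketch; the name word_to_subkey is hypothetical and uint32_t is used instead of unsigned int to make the 32-bit assumption explicit:

```c
#include <stdint.h>

/* Store the four bytes of w in subkey, most significant byte first.
   The masks make the truncation to 8 bits explicit. The result does
   not depend on the byte order of the machine. */
void word_to_subkey(uint32_t w, unsigned char subkey[4])
{
    subkey[0] = (unsigned char)((w >> 24) & 0xFFu);
    subkey[1] = (unsigned char)((w >> 16) & 0xFFu);
    subkey[2] = (unsigned char)((w >> 8) & 0xFFu);
    subkey[3] = (unsigned char)(w & 0xFFu);
}
```

Calling word_to_subkey(0x12345678, subkey) leaves the bytes 0x12, 0x34, 0x56, 0x78 in subkey[0] through subkey[3].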


Section 38.2: gcc Basic asm support 
Basic assembly support with gcc has the following syntax: 


asm [ volatile ] ( AssemblerInstructions )


where AssemblerInstructions is the direct assembly code for the given processor. The volatile keyword is optional 
and has no effect as gcc does not optimize code within a basic asm statement. AssemblerInstructions can contain 
multiple assembly instructions. A basic asm statement is used if you have an asm routine that must exist outside of 
a C function. The following example is from the GCC manual: 


/* Note that this code will not compile with -masm=intel */
#define DebugBreak() asm("int $3")


In this example, you could then use DebugBreak() in other places in your code and it will execute the assembly
instruction int $3. Note that even though gcc will not modify any code in a basic asm statement, the optimizer may
still move consecutive asm statements around. If you have multiple assembly instructions that must occur in a
specific order, include them in one asm statement.


Section 38.3: gcc Extended asm support 


Extended asm support in gcc has the following syntax: 


asm [volatile] ( AssemblerTemplate
                 : OutputOperands
                 [ : InputOperands
                 [ : Clobbers ] ])


asm [volatile] goto ( AssemblerTemplate
                      :
                      : InputOperands
                      : Clobbers
                      : GotoLabels)


where AssemblerTemplate is the template for the assembler instruction, OutputOperands are any C variables that 
can be modified by the assembly code, InputOperands are any C variables used as input parameters, Clobbers are 
a list of registers that are modified by the assembly code, and GotoLabels are any goto statement labels that may
be used in the assembly code. 


The extended format is used within C functions and is the more typical usage of inline assembly. Below is an 
example from the Linux kernel for byte swapping 16-bit and 32-bit numbers for an ARM processor: 


/* From arch/arm/include/asm/swab.h in Linux kernel version 4.6.4 */
#if __LINUX_ARM_ARCH__ >= 6


static inline __attribute_const__ __u32 __arch_swahb32(__u32 x)
{
    __asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
    return x;
}
#define __arch_swahb32 __arch_swahb32
#define __arch_swab16(x) ((__u16)__arch_swahb32(x))


static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
{
    __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
    return x;
}
#define __arch_swab32 __arch_swab32


#endif


Each asm section uses the variable x as its input and output parameter. The C function then returns the 
manipulated result. 


With the extended asm format, gcc may optimize the assembly instructions in an asm block following the same 
rules it uses for optimizing C code. If you want your asm section to remain untouched, use the volatile keyword 
for the asm section. 
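

To make clear what the two ARM instructions compute, here is a portable C sketch of the same byte swaps. The function names swab32_portable and swahb32_portable are invented for this example and are not part of the kernel:

```c
#include <stdint.h>

/* Reverse all four bytes of x, as the ARM rev instruction does:
   0xAABBCCDD becomes 0xDDCCBBAA. */
uint32_t swab32_portable(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Swap the two bytes inside each 16-bit half, as rev16 does:
   0xAABBCCDD becomes 0xBBAADDCC. */
uint32_t swahb32_portable(uint32_t x)
{
    return ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8);
}
```

A fallback of this kind is what code typically uses on targets where no single byte-swap instruction is available.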




Chapter 39: Identifier Scope 


Section 39.1: Function Prototype Scope 


#include <stdio.h>

/* The parameter name, apple, has function prototype scope. These names
   are not significant outside the prototype itself. This is demonstrated
   below. */

int test_function(int apple);

int main(void)
{
    int orange = 5;
    orange = test_function(orange);
    printf("%d\r\n", orange); //orange = 6
    return 0;
}

int test_function(int fruit)
{
    fruit += 1;
    return fruit;
}


Note that you get puzzling error messages if you introduce a type name in a prototype: 


int function(struct whatever *arg);


struct whatever
{
    int a;
    // ...
};


int function(struct whatever *arg)
{
    return arg->a;
}


With GCC 6.3.0, this code (source file dc11.c) produces: 


$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -c dc11.c
dc11.c:1:25: error: ‘struct whatever’ declared inside parameter list will not be visible outside of
this definition or declaration [-Werror]
     int function(struct whatever *arg);
                         ^~~~~~~~
dc11.c:9:9: error: conflicting types for ‘function’
     int function(struct whatever *arg)
         ^~~~~~~~
dc11.c:1:9: note: previous declaration of ‘function’ was here
     int function(struct whatever *arg);
         ^~~~~~~~
cc1: all warnings being treated as errors


Place the structure definition before the function declaration, or add struct whatever; as a line before the
function declaration, and there is no problem. You should not introduce new type names in a function prototype
because there's no way to use that type, and hence no way to define or use that function.
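

The second fix, declaring the tag before the prototype, can be sketched as follows. It reuses the names from the failing example above:

```c
/* Declaring the tag first gives it file scope, so the prototype below
   refers to the same type as the later definition. */
struct whatever;

int function(struct whatever *arg);

struct whatever
{
    int a;
};

int function(struct whatever *arg)
{
    return arg->a;
}
```

This arrangement is common in header files, where a function only needs a pointer to a structure whose full definition lives elsewhere.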


Section 39.2: Block Scope 


An identifier has block scope if its corresponding declaration appears inside a block (this also applies to parameter
declarations in a function definition). The scope ends at the end of the corresponding block.


Two different entities with the same identifier cannot have the same scope, but scopes may overlap. In case of
overlapping scopes, the only visible entity is the one declared in the innermost scope.


#include <stdio.h>


void test(int bar)                   // bar has scope test function block
{
    int foo = 5;                     // foo has scope test function block
    {
        int bar = 10;                // bar has scope inner block, this overlaps with previous
                                     // test:bar declaration, and it hides test:bar
        printf("%d %d\n", foo, bar); // 5 10
    }                                // end of scope for inner bar
    printf("%d %d\n", foo, bar);     // 5 5, here bar is test:bar
}                                    // end of scope for test:foo and test:bar


int main(void)
{
    int foo = 3;         // foo has scope main function block
    printf("%d\n", foo); // 3
    test(5);
    printf("%d\n", foo); // 3
    return 0;
}                        // end of scope for main:foo


Section 39.3: File Scope 


#include <stdio.h> 


/* The identifier, foo, is declared outside all blocks. 
It can be used anywhere after the declaration until the end of 
the translation unit. */ 

static int foo;


void test_function(void)
{
    foo += 2;
}

int main(void)
{
    foo = 1;
    test_function();
    printf("%d\r\n", foo); //foo = 3;
    return 0;
}


Section 39.4: Function scope


Function scope is the special scope for labels. This is due to their unusual property: a label is visible throughout the
entire function in which it is defined, and one can jump to it (using the instruction goto label) from any point in the same
function. While not useful, the following example illustrates the point:


#include <stdio.h> 


int main(int argc,char *argv[]) {
    int a = 0;
    goto INSIDE;
  OUTSIDE:
    if (a!=0) {
        int i=0;
      INSIDE:
        printf("a=%d\n",a);
        goto OUTSIDE;
    }
}


INSIDE may seem defined inside the if block, as is the case for i, whose scope is the block, but it is not. It is visible
in the whole function, as the instruction goto INSIDE; illustrates. Thus there can't be two labels with the same
identifier in a single function.


A possible usage is the following pattern, which implements correct, complex cleanup of allocated resources:


#include <stdlib.h> 
#include <stdio.h> 


void a_function(void) {
    int error = 0;   /* placeholder: set to nonzero when the work below fails */
    double* a = malloc(sizeof(double[34]));

    if (!a) {
        fprintf(stderr, "can't allocate\n");
        return;                /* No point in freeing a if it is null */
    }
    FILE* b = fopen("some_file","r");
    if (!b) {
        fprintf(stderr, "can't open\n");
        goto CLEANUP1;         /* Free a; no point in closing b */
    }

    /* do something reasonable */
    if (error) {
        fprintf(stderr, "something's wrong\n");
        goto CLEANUP2;         /* Free a and close b to prevent leaks */
    }
    /* do yet something else */
CLEANUP2:
    fclose(b);
CLEANUP1:
    free(a);
}


Labels such as CLEANUP1 and CLEANUP2 are special identifiers that behave differently from all other identifiers. They
are visible from everywhere inside the function, even in places that are executed before the labeled statement, or
even in places that could never be reached if none of the goto statements is executed. Labels are often written in lower-case
rather than upper-case.


Chapter 40: Implicit and Explicit Conversions


Section 40.1: Integer Conversions in Function Calls 


Given that the function has a proper prototype, integers are widened for calls to functions according to the rules of 
integer conversion, C11 6.3.1.3. 


6.3.1.3 Signed and unsigned integers 
When a value with integer type is converted to another integer type other than _Bool, if the value can be 
represented by the new type, it is unchanged. 


Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one 
more than the maximum value that can be represented in the new type until the value is in the range of 
the new type. 


Otherwise, the new type is signed and the value cannot be represented in it; either the result is 
implementation-defined or an implementation-defined signal is raised. 
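

The first two rules quoted above can be checked directly, since their results are fully defined. The following sketch uses two made-up helper functions, to_u8 and widen; it leaves out the third rule because its result is implementation-defined:

```c
#include <stdint.h>

/* Conversion to an unsigned type is fully defined: for uint8_t the
   result is the value reduced modulo 256. */
uint8_t to_u8(int value)
{
    return (uint8_t)value;
}

/* A value that fits in the new type is unchanged by the conversion. */
int64_t widen(int8_t value)
{
    return value;
}
```

For example, to_u8(300) yields 44 (300 - 256) and to_u8(-1) yields 255 (-1 + 256), while widen(-5) is still -5.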


Usually you should not truncate a wide signed type to a narrower signed type, because obviously the values can't fit 
and there is no clear meaning that this should have. The C standard cited above defines these cases to be 
"implementation-defined", that is, they are not portable. 


The following example assumes that int is 32 bits wide.


#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>   /* PRIu64 and PRId64 */


void param_u8(uint8_t val) {
    printf("%s val is %d\n", __func__, val); /* val is promoted to int */
}

void param_u16(uint16_t val) {
    printf("%s val is %d\n", __func__, val); /* val is promoted to int */
}

void param_u32(uint32_t val) {
    printf("%s val is %u\n", __func__, val); /* here val fits into unsigned */
}

void param_u64(uint64_t val) {
    printf("%s val is %" PRIu64 "\n", __func__, val); /* Fixed with format string */
}

void param_s8(int8_t val) {
    printf("%s val is %d\n", __func__, val); /* val is promoted to int */
}

void param_s16(int16_t val) {
    printf("%s val is %d\n", __func__, val); /* val is promoted to int */
}

void param_s32(int32_t val) {
    printf("%s val is %d\n", __func__, val); /* val has same width as int */
}

void param_s64(int64_t val) {
    printf("%s val is %" PRId64 "\n", __func__, val); /* Fixed with format string */
}


int main(void) {
    /* Declare integers of various widths */
    uint8_t u8 = 127;
    int64_t s64 = INT64_MAX;

    /* Integer argument is widened when function parameter is wider */
    param_u8(u8);   /* param_u8 val is 127 */
    param_u16(u8);  /* param_u16 val is 127 */
    param_u32(u8);  /* param_u32 val is 127 */
    param_u64(u8);  /* param_u64 val is 127 */
    param_s8(u8);   /* param_s8 val is 127 */
    param_s16(u8);  /* param_s16 val is 127 */
    param_s32(u8);  /* param_s32 val is 127 */
    param_s64(u8);  /* param_s64 val is 127 */

    /* Integer argument is truncated when function parameter is narrower */
    param_u8(s64);  /* param_u8 val is 255 */
    param_u16(s64); /* param_u16 val is 65535 */
    param_u32(s64); /* param_u32 val is 4294967295 */
    param_u64(s64); /* param_u64 val is 9223372036854775807 */
    param_s8(s64);  /* param_s8 val is implementation defined */
    param_s16(s64); /* param_s16 val is implementation defined */
    param_s32(s64); /* param_s32 val is implementation defined */
    param_s64(s64); /* param_s64 val is 9223372036854775807 */

    return 0;
}

Section 40.2: Pointer Conversions in Function Calls 


Pointer conversions to void* are implicit, but any other pointer conversion must be explicit. While the compiler
allows an explicit conversion from any pointer-to-data type to any other pointer-to-data type, accessing an object
through a wrongly typed pointer is erroneous and leads to undefined behavior. The only cases in which this is
allowed are when the types are compatible or when the pointer through which you are looking at the object has a character type.


#include <stdio.h>


void func_voidp(void* voidp) {
    printf("%s Address of ptr is %p\n", __func__, voidp);
}


/* Structures have same shape, but not same type */
struct struct_a {
    int a;
    int b;
} data_a;

struct struct_b {
    int a;
    int b;
} data_b;


void func_struct_b(struct struct_b* bp) {
    printf("%s Address of ptr is %p\n", __func__, (void*) bp);
}


int main(void) {
    /* Implicit ptr conversion allowed for void* */
    func_voidp(&data_a);

    /*
     * Explicit ptr conversion for other types
     *
     * Note that here although the structures have identical definitions,
     * the types are not compatible, and that this call is
     * erroneous and leads to undefined behavior on execution.
     */
    func_struct_b((struct struct_b*)&data_a);

    /* My output shows: */
    /* func_voidp Address of ptr is 0x601030 */
    /* func_struct_b Address of ptr is 0x601030 */

    return 0;
}
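

By contrast, the two permitted ways of looking at an object can be sketched like this. The helper names count_nonzero_bytes and copy_as_b are hypothetical, and the copy assumes that both structures have the same size and layout, which holds on common implementations but is not promised by the standard for distinct struct types:

```c
#include <stddef.h>
#include <string.h>

struct struct_a { int a; int b; };
struct struct_b { int a; int b; };

/* Assumption made explicit: the two types occupy the same number of bytes. */
_Static_assert(sizeof(struct struct_a) == sizeof(struct struct_b),
               "struct_a and struct_b must have the same size");

/* Inspecting an object through a pointer to a character type is allowed:
   count how many of its bytes are non-zero. */
size_t count_nonzero_bytes(const void *object, size_t size)
{
    const unsigned char *bytes = object;   /* implicit conversion from void* */
    size_t count = 0;
    for (size_t i = 0; i < size; ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}

/* Copying the bytes with memcpy avoids reading a struct_a object
   through a pointer to struct_b. */
struct struct_b copy_as_b(const struct struct_a *src)
{
    struct struct_b dst;
    memcpy(&dst, src, sizeof dst);
    return dst;
}
```

Both functions work on the object representation (its bytes) and never dereference a pointer of an incompatible structure type.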
Chapter 41: Type Qualifiers


Section 41.1: Volatile variables 


The volatile keyword tells the compiler that the value of the variable may change at any time as a result of 
external conditions, not only as a result of program control flow. 


The compiler must perform every read and write of a volatile variable that the source code specifies; it may not cache the value or optimize the accesses away.


volatile int foo; /* Different ways to declare a volatile variable */ 
int volatile foo; 


volatile uint8_t * pReg; /* Pointers to volatile variable */ 
uint8_t volatile * pReg; 


There are two main reasons to use volatile variables:


• To interface with hardware that has memory-mapped I/O registers.
• When using variables that are modified outside the program control flow (e.g., in an interrupt service routine)


Let's see this example:


#include <stdbool.h>

int quit = false;

int main(void)
{
    while (!quit) {
        // Do something that does not modify the quit variable
    }
    return 0;
}

void interrupt_handler(void)
{
    quit = true;
}


The compiler is allowed to notice the while loop does not modify the quit variable and convert the loop to an
endless while (true) loop. Even if the quit variable is set in the signal handler for SIGINT and SIGTERM, the
compiler does not know that.


Declaring quit as volatile will tell the compiler to not optimize the loop and the problem will be solved.


The same problem happens when accessing hardware, as we see in this example:


uint8_t * pReg = (uint8_t *) 0x1717;


// Wait for register to become non-zero
while (*pReg == 0) { } // Do something else


The optimizer may read the variable's value only once, reasoning that there is no need to reread it because the value will
always be the same. So we end up with an infinite loop. To force the compiler to do what we want, we modify the
declaration to:


uint8_t volatile * pReg = (uint8_t volatile *) 0x1717;
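

For the signal-handler case, standard C provides the type sig_atomic_t for flags of this kind. The following is a minimal sketch using only signal() and raise() from <signal.h>; the names quit, on_signal and wait_for_quit are chosen for illustration, and the signal is raised by the program itself so that the example is self-contained:

```c
#include <signal.h>

/* volatile sig_atomic_t is the type the C standard designates for
   objects shared between a signal handler and the rest of the program. */
static volatile sig_atomic_t quit = 0;

static void on_signal(int sig)
{
    (void)sig;      /* unused */
    quit = 1;
}

/* Installs the handler, raises SIGINT against the current process, and
   spins until the handler has set the flag. Returns the number of loop
   iterations that ran before the flag was seen, or -1 on failure. */
long wait_for_quit(void)
{
    long iterations = 0;

    quit = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);           /* the handler runs before raise returns */
    while (!quit) {          /* volatile forces a fresh read each time */
        ++iterations;
    }
    signal(SIGINT, SIG_DFL); /* restore the default disposition */
    return iterations;
}
```

In a real program the signal would arrive asynchronously, for instance when the user presses Ctrl-C, and the loop body would do useful work between checks of the flag.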




Section 41.2: Unmodifiable (const) variables 


const int a = 0; /* This variable is "unmodifiable", the compiler
                    should throw an error when this variable is changed */
int b = 0; /* This variable is modifiable */


b += 10; /* Changes the value of 'b' */
a += 10; /* Throws a compiler error */


The const qualification only means that we don't have the right to change the data. It doesn't mean that the value 
cannot change behind our back. 


_Bool doIt(double const* a) {
    double rememberA = *a;
    // do something long and complicated that calls other functions

    return rememberA == *a;
}


During the execution of the other calls *a might have changed, and so this function may return either false or
true.


Warning


Variables with const qualification could still be changed using pointers:


const int a = 0;


int *a_ptr = (int*)&a; /* This conversion must be explicitly done with a cast */
*a_ptr += 10;          /* This has undefined behavior */


printf("a = %d\n", a); /* May print: "a = 10" */ 


But doing so is an error that leads to undefined behavior. The difficulty here is that this may behave as expected in
simple examples such as this one, but then go wrong when the code grows.
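

The legal counterpart, where the underlying object is not const and only the view of it is, can be sketched as follows. The function name value_changed_behind_view is made up for this example:

```c
/* Returns 1 if the value seen through a pointer-to-const changed
   while the function was running, 0 otherwise. */
int value_changed_behind_view(void)
{
    int value = 1;              /* the object itself is not const */
    const int *view = &value;   /* read-only view of the same object */
    int before = *view;

    value = 2;                  /* legal: modifies the object directly */

    return *view != before;
}
```

Here nothing is undefined: const on the pointed-to type only restricts what may be done through view, not what may happen to the object it points to.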
Chapter 42: Typedef


The typedef mechanism allows the creation of aliases for other types. It does not create new types. People often 
use typedef to improve the portability of code, to give aliases to structure or union types, or to create aliases for 
function (or function pointer) types. 


In the C standard, typedef is classified as a 'storage class' for convenience; it occurs syntactically where storage 
classes such as static or extern could appear. 


Section 42.1: Typedef for Structures and Unions 


You can give alias names to a struct: 


typedef struct Person { 
char name[32]; 
int age; 

} Person; 


Person person; 


Compared to the traditional way of declaring structs, programmers don't need to write struct every time they
declare an instance of that struct.


Note that the name Person (as opposed to struct Person) is not defined until the final semicolon. Thus for linked 
lists and tree structures which need to contain a pointer to the same structure type, you must use either: 


typedef struct Person { 
char name[32]; 
int age; 
struct Person *next;
} Person; 


or: 


typedef struct Person Person; 


struct Person { 
char name[32]; 
int age; 
Person *next; 


};


The use of a typedef for a union type is very similar.


typedef union Float Float; 


union Float
{
    float f;
    char b[sizeof(float)];
};


A structure similar to this can be used to analyze the bytes that make up a float value. 
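

As a sketch of that idea, the two helpers below copy a float's bytes out of the union and back in again. The function names are hypothetical, and unsigned char is used for the byte array instead of char so that byte values are never negative:

```c
#include <stddef.h>

typedef union Float Float;

union Float
{
    float f;
    unsigned char b[sizeof(float)];
};

/* Copy the object representation of value into out, which must have
   room for sizeof(float) bytes. */
void float_to_bytes(float value, unsigned char *out)
{
    Float u;
    u.f = value;
    for (size_t i = 0; i < sizeof u.b; ++i) {
        out[i] = u.b[i];
    }
}

/* Rebuild a float from bytes previously produced by float_to_bytes. */
float bytes_to_float(const unsigned char *in)
{
    Float u;
    for (size_t i = 0; i < sizeof u.b; ++i) {
        u.b[i] = in[i];
    }
    return u.f;
}
```

The meaning of the individual bytes depends on the floating-point format and byte order of the platform, so only the round trip is portable, not the byte values themselves.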
Section 42.2: Typedef for Function Pointers


We can use typedef to simplify the usage of function pointers. Imagine we have some functions, all having the 
same signature, that use their argument to print out something in different ways: 


#include <stdio.h>


void print_to_n(int n)
{
    for (int i = 1; i <= n; ++i)
        printf("%d\n", i);
}

void print_n(int n)
{
    printf("%d\n", n);
}


Now we can use a typedef to create a named function pointer type called printer_t:


typedef void (*printer_t)(int);


This creates a type, named printer_t for a pointer to a function that takes a single int argument and returns 
nothing, which matches the signature of the functions we have above. To use it we create a variable of the created 
type and assign it a pointer to one of the functions in question: 


printer_t p = &print_to_n; 
void (*p)(int) = &print_to_n; // This would be required without the type 


Then to call the function pointed to by the function pointer variable: 


p(5); // Prints 1 2 3 4 5 on separate lines 
(*p) (5); // So does this 


Thus the typedef allows a simpler syntax when dealing with function pointers. This becomes more apparent when 
function pointers are used in more complex situations, such as arguments to functions. 


If you are using a function that takes a function pointer as a parameter without a function pointer type defined, the
function definition would be:


void foo (void (*printer)(int), int y){
    //code
    printer(y);
    //code
}


However, with the typedef it is:


void foo (printer_t printer, int y){
    //code
    printer(y);
    //code
}

Likewise, functions can return function pointers and, again, the use of a typedef can make the syntax simpler when
doing so.


A classic example is the signal function from <signal.h>. The declaration for it (from the C standard) is:


void (*signal(int sig, void (*func)(int)))(int);


That's a function that takes two arguments — an int and a pointer to a function which takes an int as an argument 
and returns nothing — and which returns a pointer to function like its second argument. 


If we defined a type SigCatcher as an alias for the pointer to function type:


typedef void (*SigCatcher)(int);


then we could declare signal() using:


SigCatcher signal(int sig, SigCatcher func);


On the whole, this is easier to understand (even though the C standard did not elect to define a type to do the job). 
The signal function takes two arguments, an int and a SigCatcher, and it returns a SigCatcher — where a 
SigCatcher is a pointer to a function that takes an int argument and returns nothing. 


Although using typedef names for pointer to function types makes life easier, it can also lead to confusion for 
others who will maintain your code later on, so use with caution and proper documentation. See also Function 
Pointers. 
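

As a small sketch of the "functions returning function pointers" case, the block below reuses the printer_t typedef from this section. The functions print_decimal, print_hex and choose_printer are hypothetical names invented for illustration; they are not part of any library.

```c
#include <stdio.h>

/* Same typedef as in the text above. */
typedef void (*printer_t)(int);

/* Hypothetical printers, invented for this sketch. */
static void print_decimal(int n) { printf("%d\n", n); }
static void print_hex(int n)     { printf("0x%x\n", (unsigned)n); }

/* With the typedef: a function that takes an int and returns a printer_t. */
printer_t choose_printer(int use_hex)
{
    return use_hex ? print_hex : print_decimal;
}

/* Without the typedef, the very same function type has to be spelled out.
 * The two declarations are compatible, so this redeclaration is accepted. */
void (*choose_printer(int use_hex))(int);
```

A call such as choose_printer(1)(255); first selects a printer and then invokes it through the returned pointer.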


Section 42.3: Simple Uses of Typedef 
For giving short names to a data type 


Instead of: 


long long int foo; 
struct mystructure object; 


one can use 


/* write once */ 
typedef long long ll; 
typedef struct mystructure mystruct; 


/* use whenever needed */ 
ll foo; 
mystruct object; 


This reduces the amount of typing needed if the type is used many times in the program. 


Improving portability 


The attributes of data types vary across different architectures. For example, an int may be a 2-byte type in one 
implementation and a 4-byte type in another. Suppose a program needs to use a 4-byte type to run correctly. 


In one implementation, let the size of int be 2 bytes and that of long be 4 bytes. In another, let the size of int be 4 
bytes and that of long be 8 bytes. If the program is written using the second implementation, 


/* program expecting a 4 byte integer */ 
int foo; /* need to hold 4 bytes to work */ 
/* some code involving many more ints */ 


For the program to run in the first implementation, all the int declarations will have to be changed to long. 


/* program now needs long */ 
long foo; /*need to hold 4 bytes to work */ 
/* some code involving many more longs - lot to be changed */ 


To avoid this, one can use typedef 


/* program expecting a 4 byte integer */ 

typedef int myint; /* need to declare once - only one line to modify if needed */ 
myint foo; /* need to hold 4 bytes to work */ 

/* some code involving many more myints */ 


Then, only the typedef statement would need to be changed each time, instead of examining the whole program. 


Version ≥ C99 


The <stdint.h> header and the related <inttypes.h> header define standard type names (using typedef) for 
integers of various sizes, and these names are often the best choice in modern code that needs fixed size integers. 
For example, uint8_t is an unsigned 8-bit integer type; int64_t is a signed 64-bit integer type. The type uintptr_t 
is an unsigned integer type big enough to hold any pointer to object. These exact-width and pointer-sized types are 
theoretically optional, but it is rare for them not to be available. There are variants like uint_least16_t (the 
smallest unsigned integer type with at least 16 bits) and int_fast32_t (the fastest signed integer type with at least 
32 bits). Also, intmax_t and uintmax_t are the largest integer types supported by the implementation. The least, 
fast and max types are mandatory. 
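

A minimal sketch of how the myint example above could be written with <stdint.h>, assuming the implementation provides int32_t (it is optional, as noted). The helper myint_bits is a hypothetical name used only to make the width observable.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* A 4-byte signed integer on every implementation that provides int32_t. */
typedef int32_t myint;

/* Hypothetical helper: number of bits in the object representation of myint.
 * Exact-width types have no padding bits, so this is the value width too. */
size_t myint_bits(void)
{
    return sizeof(myint) * CHAR_BIT;
}
```

With this typedef, no line needs to change when the program moves between the two implementations described above.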


To specify a usage or to improve readability 


If a set of data has a particular purpose, one can use typedef to give it a meaningful name. Moreover, if the 
property of the data changes such that the base type must change, only the typedef statement would have to be 
changed, instead of examining the whole program. 


Chapter 43: Storage Classes 


A storage class specifier determines the storage duration and linkage of an object or function. By knowing the 
storage class of a variable, we can determine the lifetime of that variable during the run-time of the program. 


Section 43.1: auto 


This storage class denotes that an identifier has automatic storage duration. This means once the scope in which 
the identifier was defined ends, the object denoted by the identifier is no longer valid. 


Since all objects, not living in global scope or being declared static, have automatic storage duration by default 
when defined, this keyword is mostly of historical interest and should not be used: 


int foo(void)
{
    /* An integer with automatic storage duration. */
    auto int i = 3;

    /* Same */
    int j = 5;

    return 0;
} /* The values of i and j are no longer able to be used. */


Section 43.2: register 


Hints to the compiler that access to an object should be as fast as possible. Whether the compiler actually uses the 
hint is implementation-defined; it may simply treat it as equivalent to auto. 


The only property that is definitively different for all objects that are declared with register is that they cannot 
have their address computed. Thereby register can be a good tool to ensure certain optimizations: 


register size_t size = 467; 


is an object that can never alias because no code can pass its address to another function where it might be 
changed unexpectedly. 


This property also implies that an array 


register int array[5]; 


cannot decay into a pointer to its first element (i.e. array turning into &array[0]). This means that the elements of 
such an array cannot be accessed and the array itself cannot be passed to a function. 


In fact, the only legal usage of an array declared with a register storage class is the sizeof operator; any other 
operator would require the address of the first element of the array. For that reason, arrays generally should not be 
declared with the register keyword since it makes them useless for anything other than size computation of the 
entire array, which can be done just as easily without the register keyword. 


The register storage class is more appropriate for variables that are defined inside a block and are accessed with 
high frequency. For example, 


/* prints the sum of the first 5 integers */
/* code assumed to be part of a function body */
{
    register int k, sum;
    for (k = 1, sum = 0; k < 6; sum += k, k++);
    printf("\t%d\n", sum);
}


Version ≥ C11 


The _Alignof operator is also allowed to be used with register arrays. 


Section 43.3: static 


The static storage class serves different purposes, depending on the location of the declaration in the file: 


1. To confine the identifier to that translation unit only (scope=file). 


/* No other translation unit can use this variable. */
static int i;


/* Same; static is attached to the function type of f, not the return type int. */
static int f(int n);


2. To save data for use with the next call of a function (scope=block): 


#include <stdio.h>

void foo()
{
    static int a = 0; /* has static storage duration and its lifetime is the
                       * entire execution of the program; initialized to 0
                       * once, before the first call */
    int b = 0;        /* b has block scope and has automatic storage duration and
                       * only "exists" within function */

    a += 10;
    b += 10;

    printf("static int a = %d, int b = %d\n", a, b);
}

int main(void)
{
    int i;
    for (i = 0; i < 5; i++)
    {
        foo();
    }
    return 0;
}


This code prints:


static int a = 10, int b = 10
static int a = 20, int b = 10
static int a = 30, int b = 10
static int a = 40, int b = 10
static int a = 50, int b = 10


Static variables retain their value between calls, even when the function is called from multiple different threads. 


Version ≥ C99 


3. Used in function parameters to denote an array is expected to have a constant minimum number of 
elements and a non-null parameter: 


/* a is expected to have at least 512 elements. */
void printInts(int a[static 512])
{
    size_t i;
    for (i = 0; i < 512; ++i)
        printf("%d\n", a[i]);
}


The required number of items (or even a non-null pointer) is not necessarily checked by the compiler, and 
compilers are not required to notify you in any way if you don't have enough elements. If a programmer 
passes fewer than 512 elements or a null pointer, undefined behavior is the result. Since it is impossible to 
enforce this, extra care must be used when passing a value for that parameter to such a function. 
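

A smaller sketch of the same idea, assuming a C99 or later compiler. The function name sum_first3 is made up for this illustration; the point is only that the caller, not the compiler, is responsible for honouring the promise.

```c
#include <stddef.h>

/* 'static 3' promises the callee that v is non-null and points to the first
 * of at least 3 ints. The function name is invented for this sketch. */
int sum_first3(const int v[static 3])
{
    int total = 0;
    for (size_t i = 0; i < 3; ++i)
        total += v[i];
    return total;
}
```

Passing an array with four elements is fine, since only a minimum is promised; passing NULL or a two-element array would be undefined behavior.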


Section 43.4: typedef 
Defines a new type based on an existing type. Its syntax mirrors that of a variable declaration. 


/* Byte can be used wherever ‘unsigned char’ is needed */ 
typedef unsigned char Byte; 


/* Integer is the type used to declare an array consisting of a single int */ 
typedef int Integer[1]; 


/* NodeRef is a type used for pointers to a structure type with the tag "node" */
typedef struct node *NodeRef;


/* SigHandler is the function pointer type that gets passed to the signal function. */
typedef void (*SigHandler)(int);


While typedef does not affect storage, it is grouped with the storage class specifiers for syntactic convenience: none 
of the other storage classes are allowed if the typedef keyword is used. 


Typedefs are important and should not be replaced with #define macros.


typedef int newType;
newType *ptr; // ptr is a pointer to a variable of type 'newType', aka int


However,


#define int newType
newType *ptr; // Error: the macro turns every later 'int' into 'newType', but no type
              // named 'newType' is ever declared, so ptr is not a pointer to int
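

Even a correctly written macro behaves differently from a typedef as soon as pointers are involved. The sketch below uses hypothetical names (string_t, STRING_M and the two helper functions) to show that a macro is plain text substitution while a typedef names a real type.

```c
#include <stddef.h>

typedef char *string_t;   /* a real type: pointer to char */
#define STRING_M char *   /* plain text substitution */

/* Size of the second variable when declared through the typedef. */
size_t second_size_typedef(void)
{
    string_t a = NULL, b = NULL;   /* both a and b are char * */
    (void)a;
    return sizeof b;
}

/* Size of the second variable when declared through the macro. */
size_t second_size_macro(void)
{
    STRING_M a = NULL, b = 0;      /* expands to: char *a = NULL, b = 0;
                                      so b is a plain char */
    (void)a;
    return sizeof b;
}
```

The first function returns sizeof(char *), the second returns 1, because only the first declarator picks up the * from the macro.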


Section 43.5: extern 


Used to declare an object or function that is defined elsewhere (and that has external linkage). In general, it is 
used to declare an object or function to be used in a module that is not the one in which the corresponding object 
or function is defined:


/* file1.c */
int foo = 2; /* Has external linkage since it is declared at file scope. */


/* file2.c */
#include <stdio.h>

int main(void)
{
    /* The "extern" keyword refers to the external definition of 'foo'. */
    extern int foo;
    printf("%d\n", foo);
    return 0;
}


Version ≥ C99


Things get slightly more interesting with the introduction of the inline keyword in C99:


/* Should usually be placed in a header file such that all users see the definition. */
/* Hints to the compiler that the function 'bar' might be inlined */
/* and suppresses the generation of an external symbol, unless stated otherwise. */
inline void bar(int drink)
{
    printf("You ordered drink no.%d\n", drink);
}


/* To be found in just one .c file.
   Creates an external function definition of 'bar' for use by other files.
   The compiler is allowed to choose between the inline version and the external
   definition when 'bar' is called. Without this line, 'bar' would only be an inline
   function, and other files would not be able to call it. */
extern void bar(int);
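

The same pattern can be tried in a single translation unit. This is a sketch assuming C99 inline semantics; square is a hypothetical function, and in a real project the inline definition would sit in a header while the extern line would appear in exactly one .c file.

```c
/* Inline definition: on its own it does not provide an external symbol. */
inline int square(int x)
{
    return x * x;
}

/* Because this file-scope declaration does not say 'inline', the definition
 * above becomes an external definition in this translation unit, so calls
 * that the compiler chooses not to inline can still be linked. */
extern int square(int x);
```

Without the extern declaration, a build that does not inline the call may fail at link time with an undefined reference to square.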


Section 43.6: _Thread_local 


Version ≥ C11 


This storage specifier was introduced in C11 along with multi-threading. It isn't available in earlier C 
standards. 


Denotes thread storage duration. A variable declared with the _Thread_local storage specifier is 
local to a thread, and its lifetime is the entire execution of the thread in which it's created. It can also appear along 
with static or extern. 


#include <threads.h> 
#include <stdio.h> 
#define SIZE 5 


int thread_func(void *id)
{
    /* thread local variable i. */
    static _Thread_local int i;

    /* Prints the ID passed from main() and the address of i.
     * Running this program will print different addresses for i, showing
     * that they are all distinct objects. */
    printf("From thread:[%d], Address of i (thread local): %p\n", *(int*)id, (void*)&i);

    return 0;
}


int main(void)
{
    thrd_t id[SIZE];
    int arr[SIZE] = {1, 2, 3, 4, 5};

    /* create 5 threads. */
    for (int i = 0; i < SIZE; i++) {
        thrd_create(&id[i], thread_func, &arr[i]);
    }

    /* wait for threads to complete. */
    for (int i = 0; i < SIZE; i++) {
        thrd_join(id[i], NULL);
    }
}


Chapter 44: Declarations 


Section 44.1: Calling a function from another C file 


foo.h 


#ifndef FOO_DOT_H    /* This is an "include guard" */
#define FOO_DOT_H    /* prevents the file from being included twice. */
                     /* Including a header file twice causes all kinds */
                     /* of interesting problems. */

/**
 * This is a function declaration.
 * It tells the compiler that the function exists somewhere.
 */
void foo(int id, char *name);

#endif /* FOO_DOT_H */


foo.c 


#include "foo.h" /* Always include the header file that declares something
                  * in the C file that defines it. This makes sure that the
                  * declaration and definition are always in-sync. Put this
                  * header first in foo.c to ensure the header is self-contained.
                  */
#include <stdio.h>

/**
 * This is the function definition.
 * It is the actual body of the function which was declared elsewhere.
 */
void foo(int id, char *name)
{
    fprintf(stderr, "foo(%d, \"%s\");\n", id, name);
    /* This will print how foo was called to stderr - standard error.
     * e.g., foo(42, "Hi!") will print foo(42, "Hi!");
     */
}


main.c 


#include "foo.h"

int main(void)
{
    foo(42, "bar");
    return 0;
}


Compile and Link 


First, we compile both foo.c and main.c to object files. Here we use the gcc compiler; your compiler may have a 
different name and need other options. 


$ gcc -Wall -c foo.c 
$ gcc -Wall -c main.c 


Now we link them together to produce our final executable: 


$ gcc -o testprogram foo.o main.o 


Section 44.2: Using a Global Variable 


Use of global variables is generally discouraged. They make your program more difficult to understand and harder to 
debug. But sometimes using a global variable is acceptable. 


global.h 


#ifndef GLOBAL_DOT_H /* This is an "include guard" */
#define GLOBAL_DOT_H

/**
 * This tells the compiler that g_myglobal exists somewhere.
 * Without "extern", this would create a new variable named
 * g_myglobal in _every file_ that included it. Don't miss this!
 */
extern int g_myglobal; /* _Declare_ g_myglobal, that is promise it will be _defined_ by
                        * some module. */


#endif /* GLOBAL_DOT_H */ 


global.c 


#include "global.h" /* Always include the header file that declares something 
* in the C file that defines it. This makes sure that the 
* declaration and definition are always in-sync. 
*/ 


int g_myglobal; /* _Define_ g_myglobal. As it lives at file scope it gets initialised to 0
                 * on program start-up. */


main.c 


#include "global.h" 


int main(void)
{
    g_myglobal = 42;
    return 0;
}


See also How do I use extern to share variables between source files? 


Section 44.3: Introduction 


Examples of declarations are: 


int a; /* declaring single identifier of type int */ 


The above declaration declares a single identifier named a which refers to some object with int type. 


int a1, b1; /* declaring 2 identifiers of type int */ 


The second declaration declares 2 identifiers named a1 and b1 which refer to some other objects, though with the 
same int type. 


Basically, the way this works is like this: first you put some type, then you write one or more expressions 
separated by commas (,) (which will not be evaluated at this point, and which should otherwise be referred 
to as declarators in this context). In writing such expressions, you are allowed to apply only the indirection (*), 
function call (( )) or subscript (or array indexing - [ ]) operators onto some identifier (you can also not use any 
operators at all). The identifier used is not required to be visible in the current scope. Some examples: 


int /* 1 */ *z /* 2 */ , /* 3 */ *x /* 4 */ , **c /* 5 */ ; /* 6 */ 


#   Description 
1   The name of integer type. 
2   Un-evaluated expression applying indirection to some identifier z. 
3   We have a comma indicating that one more expression will follow in the same declaration. 
4   Un-evaluated expression applying indirection to some other identifier x. 
5   Un-evaluated expression applying indirection to the value of the expression (*c). 
6   End of declaration. 


Note that none of the above identifiers were visible prior to this declaration and so the expressions used would not be 
valid before it. 


After each such expression, the identifier used in it is introduced into the current scope. (If the identifier has 
assigned linkage to it, it may also be re-declared with the same type of linkage so that both identifiers refer to the 
same object or function) 


Additionally, the equal operator sign (=) may be used for initialization. If an unevaluated expression (declarator) is 
followed by = inside the declaration - we say that the identifier being introduced is also being initialized. After the = 
sign we can put once again some expression, but this time it'll be evaluated and its value will be used as initial for 
the object declared. 


Examples: 


int l = 90; /* the same as: */ 

int l; l = 90; /* if the declaration of l was in block scope */ 

int c = 2, b[c]; /* ok, equivalent to: */ 

int c = 2; int b[c]; 


Later in your code, you are allowed to write the exact same expression from the declaration part of the newly 
introduced identifier, giving you an object of the type specified at the beginning of the declaration, assuming that 
you've assigned valid values to all accessed objects in the way. Examples: 


void f()
{
    int b2; /* you should be able to write later in your code b2
               which will directly refer to the integer object
               that b2 identifies */

    b2 = 2; /* assign a value to b2 */

    printf("%d", b2); /* ok - should print 2 */

    int *b3; /* you should be able to write later in your code *b3 */

    b3 = &b2; /* assign valid pointer value to b3 */

    printf("%d", *b3); /* ok - should print 2 */

    int **b4; /* you should be able to write later in your code **b4 */

    b4 = &b3;

    printf("%d", **b4); /* ok - should print 2 */

    void (*p)(); /* you should be able to write later in your code (*p)() */

    p = &f; /* assign a valid pointer value */

    (*p)(); /* ok - calls function f by retrieving the
               pointer value inside p - p
               and dereferencing it - *p
               resulting in a function
               which is then called - (*p)() -

               it is not *p() because else first the () operator is
               applied to p and then the resulting void object is
               dereferenced which is not what we want here */
}

The declaration of b3 specifies that you can potentially use the value of b3 as a means to access some integer object. 


Of course, in order to apply indirection (*) to b3, you should also have a proper value stored in it (see pointers for 
more info). You should also first store some value into an object before trying to retrieve it (you can see more about 
this problem here). We've done all of this in the above examples. 


int a3(); /* you should be able to call a3 */ 


This one tells the compiler that you'll attempt to call a3. In this case a3 refers to function instead of an object. One 
difference between object and function is that functions will always have some sort of linkage. Examples: 


void f1()
{
    {
        int f2(); /* 1 refers to some function f2 */
    }
    {
        int f2(); /* refers to the exact same function f2 as (1) */
    }
}


In the above example, the 2 declarations refer to the same function f2, whilst if they were declaring objects then in 
this context (having 2 different block scopes), they would be 2 distinct objects. 


int (*a3)(); /* you should be able to apply indirection to 'a3' and then call it */ 


Now it may seem to be getting complicated, but if you know operator precedence you'll have no problems reading 
the above declaration. The parentheses are needed because the * operator has lower precedence than the ( ) one. 


In the case of using the subscript operator, the resulting expression wouldn't actually be valid after the declaration 
because the index used in it (the value inside [ and ]) will always be 1 above the maximum allowed value for this 
object/function. 


int a4[5]; /* here a4 shouldn't be accessed using the index 5 later on */ 


But it should be accessible by all other indexes lower than 5. Examples: 


a4[0], a4[1]; a4[4]; 


a4[5] will result in UB. More information about arrays can be found in the chapter on arrays. 


int (*a5)[5](); /* here a5 could be applied indirection, 
                   indexed up to (but not including) 5 
                   and called */ 


Unfortunately for us, although syntactically possible, the declaration of a5 is forbidden by the current standard. 
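

What the standard does allow is an array of pointers to functions. The following is a minimal sketch of that legal alternative; add_one, twice, ops and apply_op are hypothetical names chosen for this illustration.

```c
/* Two hypothetical functions with the same signature. */
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Legal: ops is an array of 2 pointers to functions taking int and
 * returning int. The * inside the parentheses is what makes it legal. */
static int (*const ops[2])(int) = { add_one, twice };

/* Apply the function stored at the given index (0 or 1) to x. */
int apply_op(int index, int x)
{
    return ops[index](x);
}
```

Here apply_op(0, 4) goes through add_one and apply_op(1, 4) goes through twice.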


Section 44.4: Typedef 


Typedefs are declarations which have the keyword typedef in front and before the type. E.g.: 


typedef int (*(*t0)())[5]; 


(you can technically put the typedef after the type too, like this: int typedef (*(*t0)())[5]; but this is discouraged) 


The above declaration declares an identifier for a typedef name. You can use it like this afterwards: 


t0 pf; 


Which will have the same effect as writing: 


int (*(*pf)())[5]; 


As you can see, the typedef name "saves" the declaration as a type to use later for other declarations. This way you 
can save some keystrokes. Also, as a declaration using typedef is still a declaration, you are not limited only by the 
above example: 


t0 (*pf1); 


Is the same as: 


int (*(**pf1)())[5]; 
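

To make the t0 type more concrete, here is a sketch that declares the same kind of type with a (void) prototype and actually calls through it. The names table, get_table and read_via_t0 are assumptions made for this example only.

```c
/* Like t0 in the text, but with an explicit (void) parameter list:
 * pointer to function returning pointer to array of 5 int. */
typedef int (*(*t0)(void))[5];

static int table[5] = { 10, 20, 30, 40, 50 };

/* A function returning a pointer to an array of 5 int. */
int (*get_table(void))[5]
{
    return &table;
}

/* Read one element through a t0 function pointer:
 * call pf, dereference the returned array pointer, then index. */
int read_via_t0(t0 pf, int index)
{
    return (*pf())[index];
}
```

A call such as read_via_t0(get_table, 2) reads table[2] through the function pointer.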


Section 44.5: Using Global Constants 
Headers may be used to declare globally used read-only resources, like string tables for example. 


Declare those in a separate header which gets included by any file ("Translation Unit") which wants to make use of 
them. It's handy to use the same header to declare a related enumeration to identify all string-resources: 


resources.h: 


#ifndef RESOURCES_H 
#define RESOURCES_H 


typedef enum { /* Define a type describing the possible valid resource IDs. */
    RESOURCE_UNDEFINED = -1, /* To be used to initialise any EnumResourceID typed variable to be
                                marked as "not in use", "not in list", "undefined", etc.
                                That is, un-initialised on application level, not on language level.
                                Initialised uninitialised, so to say ;-)
                                It's like NULL for pointers ;-) */
    RESOURCE_UNKNOWN = 0,    /* To be used if the application uses some resource ID,
                                for which we do not have a table entry defined, a fall back in
                                case we _need_ to display something, but do not find anything
                                appropriate. */

    /* The following identify the resources we have defined: */
    RESOURCE_OK,
    RESOURCE_CANCEL,
    RESOURCE_ABORT,
    /* Insert more here. */

    RESOURCE_MAX /* The maximum number of resources defined. */
} EnumResourceID;


extern const char * const resources[RESOURCE_MAX]; /* Declare, promise to anybody who includes
                      this, that at linkage-time this symbol will be around.
                      The 1st const guarantees the strings will not change,
                      the 2nd const guarantees the string-table entries
                      will never suddenly point somewhere else as set during
                      initialisation. */

#endif 


To actually define the resources, create a related .c file, that is, another translation unit holding the actual instances 
of what has been declared in the related header (.h) file: 


resources.c: 


#include "resources.h" /* To make sure clashes between declaration and definition are
                          recognised by the compiler include the declaring header into
                          the implementing, defining translation unit (.c file). */

/* Define the resources. Keep the promise made in resources.h. */
const char * const resources[RESOURCE_MAX] = {
    "<unknown>",
    "OK",
    "Cancel",
    "Abort"
};


A program using this could look like this: 
main.c: 


#include <stdlib.h> /* for EXIT_SUCCESS */ 
#include <stdio.h> 


#include "resources.h" 
int main(void)
{
    EnumResourceID resource_id = RESOURCE_UNDEFINED;

    while ((++resource_id) < RESOURCE_MAX)
    {
        printf("resource ID: %d, resource: '%s'\n", resource_id, resources[resource_id]);
    }

    return EXIT_SUCCESS;
}


Compile the three files above using GCC, and link them to become the program file main, for example using this: 


gcc -Wall -Wextra -pedantic -Wconversion -g main.c resources.c -o main 


(use -Wall -Wextra -pedantic -Wconversion to make the compiler really picky, so you don't miss anything 
before posting the code to SO, that is to the world, or even deploying it into production) 


Run the program created: 
$ ./main 


And get: 


resource ID: 0, resource: '<unknown>' 
resource ID: 1, resource: 'OK' 
resource ID: 2, resource: 'Cancel' 
resource ID: 3, resource: 'Abort' 


Section 44.6: Using the right-left or spiral rule to decipher C declaration 


The "right-left" rule is a completely regular rule for deciphering C declarations. It can also be useful in creating 
them. 


Read the symbols as you encounter them in the declaration... 


* as "pointer to" - always on the left side 
[] as "array of" - always on the right side 
() as "function returning" - always on the right side 


How to apply the rule 


STEP 1 


Find the identifier. This is your starting point. Then say to yourself, "identifier is." You've started your declaration. 


STEP 2 


Look at the symbols on the right of the identifier. If, say, you find () there, then you know that this is the 
declaration for a function. So you would then have "identifier is function returning". Or if you found a [] there, you 
would say "identifier is array of". Continue right until you run out of symbols OR hit a right parenthesis ). (If you hit a 
left parenthesis (, that's the beginning of a () symbol, even if there is stuff in between the parentheses. More on 
that below.) 


STEP 3 


Look at the symbols to the left of the identifier. If it is not one of our symbols above (say, something like "int"), just 
say it. Otherwise, translate it into English using that table above. Keep going left until you run out of symbols OR hit 
a left parenthesis (. 


Now repeat steps 2 and 3 until you've formed your declaration. 


Here are some examples: 

int *p[];


First, find identifier:


int *p[];
     ^


"p is"


Now, move right until out of symbols or right parenthesis hit.


int *p[];
      ^^


"p is array of"


Can't move right anymore (out of symbols), so move left and find:


int *p[];
    ^


"p is array of pointer to"


Keep going left and find:


int *p[];
^^^


"p is array of pointer to int".


(or "p is an array where each element is of type pointer to int")
Another example:


int *(*func())();


Find the identifier.


int *(*func())();
       ^^^^


"func is"


Move right.


int *(*func())();
           ^^


"func is function returning"


Can't move right anymore because of the right parenthesis, so move left.


int *(*func())();
      ^


"func is function returning pointer to"


Can't move left anymore because of the left parenthesis, so keep going right.


int *(*func())();
              ^^


"func is function returning pointer to function returning"


Can't move right anymore because we're out of symbols, so go left.


int *(*func())();
    ^


"func is function returning pointer to function returning pointer to"


And finally, keep going left, because there's nothing left on the right.


int *(*func())();
^^^


"func is function returning pointer to function returning pointer to int".


As you can see, this rule can be quite useful. You can also use it to sanity check yourself while you are creating 
declarations, and to give you a hint about where to put the next symbol and whether parentheses are required. 


Some declarations look much more complicated than they are due to array sizes and argument lists in prototype 
form. If you see [3], that's read as "array (size 3) of...". If you see (char *, int) that's read as "function expecting 
(char *, int) and returning...". 


Here's a fun one: 


int (*(*fun_one)(char *, double))[9][20]; 


I won't go through each of the steps to decipher this one. 


"fun_one is pointer to function expecting (char *, double) and returning pointer to array (size 9) of array (size 20) of int." 


As you can see, it's not as complicated if you get rid of the array sizes and argument lists: 


int (*(*fun_one)())[][]; 


You can decipher it that way, and then put in the array sizes and argument lists later. 
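

As a sanity check of that reading, the sketch below gives fun_one something to point at and calls through it. The names grid, make_grid and grid_cell are hypothetical, and the cell values are arbitrary.

```c
#include <stddef.h>

static int grid[9][20];

/* Hypothetical function matching the deciphered type: it expects
 * (char *, double) and returns a pointer to an array (size 9) of
 * array (size 20) of int. */
static int (*make_grid(char *label, double scale))[9][20]
{
    (void)label;
    for (size_t r = 0; r < 9; ++r)
        for (size_t c = 0; c < 20; ++c)
            grid[r][c] = (int)(scale * (double)(r * 20 + c));
    return &grid;
}

/* fun_one is a pointer to such a function, exactly as declared in the text. */
int (*(*fun_one)(char *, double))[9][20] = make_grid;

/* Read one cell by calling through fun_one. */
int grid_cell(size_t row, size_t col)
{
    char label[] = "demo";
    return (*fun_one(label, 1.0))[row][col];
}
```

The expression (*fun_one(label, 1.0))[row][col] mirrors the declaration: call, dereference, then index twice.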


Some final words: 


It is quite possible to make illegal declarations using this rule, so some knowledge of what's legal in C is necessary. 
For instance, if the above had been: 


int *((*fun_one)())[][]; 


it would have read "fun_one is pointer to function returning array of array of pointer to int". Since a function cannot 
return an array, but only a pointer to an array, that declaration is illegal. 


Illegal combinations include: 


[]() - cannot have an array of functions 
()() - cannot have a function that returns a function 
()[] - cannot have a function that returns an array 


In all the above cases, you would need a set of parentheses to bind a * symbol on the left between these () and [] 
right-side symbols in order for the declaration to be legal. 


Here are some more examples:


Legal


int i;               an int
int *p;              an int pointer (ptr to an int)
int a[];             an array of ints
int f();             a function returning an int
int **pp;            a pointer to an int pointer (ptr to a ptr to an int)
int (*pa)[];         a pointer to an array of ints
int (*pf)();         a pointer to a function returning an int
int *ap[];           an array of int pointers (array of ptrs to ints)
int aa[][];          an array of arrays of ints
int *fp();           a function returning an int pointer
int ***ppp;          a pointer to a pointer to an int pointer
int (**ppa)[];       a pointer to a pointer to an array of ints
int (**ppf)();       a pointer to a pointer to a function returning an int
int *(*pap)[];       a pointer to an array of int pointers
int (*paa)[][];      a pointer to an array of arrays of ints
int *(*pfp)();       a pointer to a function returning an int pointer
int **app[];         an array of pointers to int pointers
int (*apa[])[];      an array of pointers to arrays of ints
int (*apf[])();      an array of pointers to functions returning an int
int *aap[][];        an array of arrays of int pointers
int aaa[][][];       an array of arrays of arrays of int
int **fpp();         a function returning a pointer to an int pointer
int (*fpa())[];      a function returning a pointer to an array of ints
int (*fpf())();      a function returning a pointer to a function returning an int


Illegal


int af[]();          an array of functions returning an int
int fa()[];          a function returning an array of ints
int ff()();          a function returning a function returning an int
int (*pfa)()[];      a pointer to a function returning an array of ints
int aaf[][]();       an array of arrays of functions returning an int
int (*paf)[]();      a pointer to an array of functions returning an int
int (*pff)()();      a pointer to a function returning a function returning an int
int *afp[]();        an array of functions returning int pointers
int afa[]()[];       an array of functions returning an array of ints
int aff[]()();       an array of functions returning functions returning an int
int *fap()[];        a function returning an array of int pointers
int faa()[][];       a function returning an array of arrays of ints
int faf()[]();       a function returning an array of functions returning an int
int *ffp()();        a function returning a function returning an int pointer


Source: http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html


Chapter 45: Structure Padding and Packing


By default, C compilers lay out structures so that each member can be accessed fast, without incurring penalties for 
unaligned access, a problem with RISC machines such as the DEC Alpha and some ARM CPUs. 


Depending on the CPU architecture and the compiler, a structure may occupy more space in memory than the sum 
of the sizes of its component members. The compiler can add padding between members or at the end of the 
structure, but not at the beginning. 


Packing overrides the default padding. 


Section 45.1: Packing structures 


By default, structures are padded in C. To avoid this behaviour, you have to request packing explicitly. Under 
GCC the request is __attribute__((__packed__)). Consider this example on a 64-bit machine: 


struct foo { 
    char *p;  /* 8 bytes */ 
    char c;   /* 1 byte */ 
    long x;   /* 8 bytes */ 
}; 


The structure will be automatically padded to have 8-byte alignment and will look like this: 


struct foo { 
    char *p;      /* 8 bytes */ 
    char c;       /* 1 byte */ 
    char pad[7];  /* 7 bytes added by compiler */ 
    long x;       /* 8 bytes */ 
}; 


So sizeof(struct foo) gives 24 instead of 17. This happens because a 64-bit target typically reads and writes 
memory in 8-byte words, and long x must start on an 8-byte boundary. The 1-byte member c uses only the first byte 
of its word; the seven bytes after it are padding that belongs to no member, so x starts at offset 16. 


Structure packing 
But if you add the attribute packed, the compiler will not add padding: 


struct __attribute__((__packed__)) foo { 
    char *p;  /* 8 bytes */ 
    char c;   /* 1 byte */ 
    long x;   /* 8 bytes */ 
}; 

Now sizeof(struct foo) will return 17. 


Generally packed structures are used: 


• To save space. 
• To format a data structure for transmission over a network, independently of the alignment rules of each 
  node's architecture. 


It must be taken in consideration that some processors such as the ARM Cortex-M0 do not allow unaligned memory 
access; in such cases, structure packing can lead to undefined behaviour and can crash the CPU. 


Section 45.2: Structure padding 
Suppose this struct is defined and compiled with a 32 bit compiler: 


struct test_32 { 
    int a;    // 4 bytes 
    short b;  // 2 bytes 
    int c;    // 4 bytes 
} str_32; 


We might expect this struct to occupy only 10 bytes of memory, but printing sizeof(str_32) shows that it uses 12 
bytes. 


This happened because the compiler aligns variables for fast access. A common pattern is that when the base type 
occupies N bytes (where N is a power of 2 such as 1, 2, 4, 8, 16 — and seldom any bigger), the variable should be 
aligned on an N-byte boundary (a multiple of N bytes). 


For the structure shown with sizeof(int) == 4 and sizeof(short) == 2, a common layout is: 


• int a; stored at offset 0; size 4. 
• short b; stored at offset 4; size 2. 
• unnamed padding at offset 6; size 2. 
• int c; stored at offset 8; size 4. 


Thus struct test_32 occupies 12 bytes of memory. In this example, there is no trailing padding. 


The compiler will ensure that any struct test_32 variables are stored starting on a 4-byte boundary, so that the 
members within the structure will be properly aligned for fast access. Memory allocation functions such as 
malloc(), calloc() and realloc() are required to ensure that the pointer returned is sufficiently well aligned for 
use with any data type, so dynamically allocated structures will be properly aligned too. 


You can end up with odd situations. For example, on a 64-bit Intel x86_64 processor (e.g. an Intel Core i7 in a Mac 
running macOS Sierra or Mac OS X), compilers in 32-bit mode align double on a 4-byte boundary, while on the same 
hardware compilers in 64-bit mode align double on an 8-byte boundary. 
Chapter 46: Memory management 


name                                       description 

size (malloc, realloc and aligned_alloc)   total size of the memory in bytes. For aligned_alloc the size must be 
                                           an integral multiple of alignment. 
size (calloc)                              size of each element 
nelements                                  number of elements 
ptr                                        pointer to allocated memory previously returned by malloc, calloc, 
                                           realloc or aligned_alloc 
alignment                                  alignment of allocated memory 


For managing dynamically allocated memory, the standard C library provides the functions malloc(), calloc(), 
realloc() and free(). In C11 and later, there is also aligned_alloc(). Some systems also provide alloca(). 


Section 46.1: Allocating Memory 
Standard Allocation 


The C dynamic memory allocation functions are defined in the <stdlib.h> header. If one wishes to allocate 
memory space for an object dynamically, the following code can be used: 


int *p = malloc(10 * sizeof *p); 
if (p == NULL) 
{ 
    perror("malloc() failed"); 
    return -1; 
} 

This computes the number of bytes that ten ints occupy in memory, then requests that many bytes from malloc 
and assigns the result (i.e., the starting address of the memory chunk that was just created using malloc) to a 
pointer named p. 


It is good practice to use sizeof to compute the amount of memory to request since the result of sizeof is 
implementation defined (except for character types, which are char, signed char and unsigned char, for which 
sizeof is defined to always give 1). 


Because malloc might not be able to service the request, it might return a null pointer. It is important to 
check for this to prevent later attempts to dereference the null pointer. 


Memory dynamically allocated using malloc() may be resized using realloc() or, when no longer needed, 
released using free(). 


Alternatively, declaring int array[10]; would allocate the same amount of memory. However, if it is declared 
inside a function without the keyword static, it is only usable within the function it is declared in and the 
functions it calls (because the array is allocated on the stack and the space is released for reuse when the 
function returns). If it is defined with static inside a function, or defined outside any function, its lifetime 
is the lifetime of the program. A function can return a pointer, but a function in C cannot return an array. 


Zeroed Memory 


The memory returned by malloc may not be initialized to a reasonable value, and care should be taken to zero the 
memory with memset or to immediately copy a suitable value into it. Alternatively, calloc returns a block of the 
desired size where all bits are initialized to 0. This need not be the same as the representation of floating-point zero 
or of a null pointer. 


int *p = calloc(10, sizeof *p); 
if (p == NULL) 
{ 
    perror("calloc() failed"); 
    return -1; 
} 

A note on calloc: Most (commonly used) implementations optimise calloc() for performance, so it is often 
faster than calling malloc() followed by memset(), even though the net effect is identical. 


Aligned Memory 


Version ≥ C11 


C11 introduced a new function aligned_alloc() which allocates space with the given alignment. It can be used if 
the memory to be allocated must be aligned at boundaries that malloc() or calloc() cannot guarantee. malloc() and 
calloc() allocate memory that is suitably aligned for any object type (i.e. the alignment is 
alignof(max_align_t)). With aligned_alloc(), greater alignments can be requested. 


/* Allocates 1024 bytes with 256 bytes alignment. */ 
char *ptr = aligned_alloc(256, 1024); 
if (ptr == NULL) { 
    perror("aligned_alloc()"); 
    return -1; 
} 

free(ptr); 


The C11 standard imposes two restrictions: 1) the size (second argument) requested must be an integral multiple of 
the alignment (first argument) and 2) the value of alignment should be a valid alignment supported by the 
implementation. Failure to meet either of them results in undefined behavior. 


Section 46.2: Freeing Memory 
It is possible to release dynamically allocated memory by calling free(). 


int *p = malloc(10 * sizeof *p); /* allocation of memory */ 
if (p == NULL) 
{ 
    perror("malloc failed"); 
    return -1; 
} 

free(p); /* release of memory */ 
/* note that after free(p), even using the *value* of the pointer p 
   has undefined behavior, until a new value is stored into it. */ 

/* reusing/re-purposing the pointer itself */ 
int i = 42; 
p = &i; /* This is valid, has defined behaviour */ 


The memory pointed to by p is reclaimed (either by the libc implementation or by the underlying OS) after the call 
to free(), so accessing that freed memory block via p will lead to undefined behavior. Pointers that reference 
memory elements that have been freed are commonly called dangling pointers, and present a security risk. 
Furthermore, the C standard states that even accessing the value of a dangling pointer has undefined behavior. 
Note that the pointer p itself can be re-purposed as shown above. 


Please note that you can only call free() on pointers that have been returned directly by malloc(), calloc(), 
realloc() or aligned_alloc(), or where documentation tells you the memory has been allocated that way (functions 
like strdup() are notable examples). Freeing a pointer that is 


• obtained by using the & operator on a variable, or 
• in the middle of an allocated block, 


is forbidden. Such an error will usually not be diagnosed by your compiler, but it leaves the program in an 
undefined state. 


There are two common strategies to prevent such instances of undefined behavior. 


The first and preferable is simple - have p itself cease to exist when it is no longer needed, for example: 


if (something_is_needed()) 
{ 
    int *p = malloc(10 * sizeof *p); 
    if (p == NULL) 
    { 
        perror("malloc failed"); 
        return -1; 
    } 

    /* do whatever is needed with p */ 

    free(p); 
} 


By calling free() directly before the end of the containing block (i.e. the }), p itself ceases to exist. The compiler will 
give a compilation error on any attempt to use p after that. 


A second approach is to also invalidate the pointer itself after releasing the memory to which it points: 


free(p); 
p = NULL; // you may also use 0 instead of NULL 


Arguments for this approach: 


• On many platforms, an attempt to dereference a null pointer causes an instant crash (segmentation fault). 
  Here, we get at least a stack trace pointing to the variable that was used after being freed. 

  Without setting the pointer to NULL we have a dangling pointer. The program will very likely still crash, but 
  later, because the memory to which the pointer points will silently be corrupted. Such bugs are difficult to 
  trace because they can result in a call stack that is completely unrelated to the initial problem. 

  This approach hence follows the fail-fast concept. 

• It is safe to free a null pointer. The C Standard specifies that free(NULL) has no effect: 

      The free function causes the space pointed to by ptr to be deallocated, that is, made available for 
      further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not 
      match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has 
      been deallocated by a call to free or realloc, the behavior is undefined. 

• Sometimes the first approach cannot be used (e.g. memory is allocated in one function, and deallocated 
  much later in a completely different function). 


Section 46.3: Reallocating Memory 


You may need to expand or shrink the storage a pointer refers to after you have allocated it. The function 
void *realloc(void *ptr, size_t size) deallocates the old object pointed to by ptr and returns a pointer to 
an object that has the size specified by size. ptr is the pointer to a memory block previously allocated with 
malloc, calloc or realloc (or a null pointer). As much of the original contents as fits in the new size is 
preserved. If the new size is larger, the additional memory beyond the old size is uninitialized. If the new size 
is smaller, the contents of the cut-off part are lost. If ptr is NULL, a new block is allocated and a pointer to 
it is returned by the function. 


#include <stdio.h> 
#include <stdlib.h> 

int main(void) 
{ 
    int *p = malloc(10 * sizeof *p); 
    if (NULL == p) 
    { 
        perror("malloc() failed"); 
        return EXIT_FAILURE; 
    } 

    p[0] = 42; 
    p[9] = 15; 

    /* Reallocate array to a larger size, storing the result into a 
     * temporary pointer in case realloc() fails. */ 
    { 
        int *temporary = realloc(p, 1000000 * sizeof *temporary); 

        /* realloc() failed, the original allocation was not free'd yet. */ 
        if (NULL == temporary) 
        { 
            perror("realloc() failed"); 
            free(p); /* Clean up. */ 
            return EXIT_FAILURE; 
        } 

        p = temporary; 
    } 

    /* From here on, array can be used with the new size it was 
     * realloc'ed to, until it is free'd. */ 

    /* The values of p[0] to p[9] are preserved, so this will print: 
       42 15 
    */ 
    printf("%d %d\n", p[0], p[9]); 

    free(p); 

    return EXIT_SUCCESS; 
} 

The reallocated object may or may not have the same address as the original object. Therefore it is important to 
capture the return value from realloc, which contains the new address if the call is successful. 


Make sure you assign the return value of realloc to a temporary instead of the original p. realloc returns a null 
pointer on failure, and assigning that to p would overwrite the only pointer to the original block, which is 
still allocated. You would lose access to your data and create a memory leak. 


Section 46.4: realloc(ptr, 0) is not equivalent to free(ptr) 
realloc is conceptually equivalent to malloc + memcpy + free on the other pointer. 


If the size of the space requested is zero, the behavior of realloc is implementation-defined. This is similar for all 
memory allocation functions that receive a size parameter of value 0. Such functions may in fact return a non-null 
pointer, but that must never be dereferenced. 


Thus, realloc(ptr, 0) is not equivalent to free(ptr). It may 


• be a "lazy" implementation and just return ptr 
• free(ptr), allocate a dummy element and return that 
• free(ptr) and return 0 
• just return 0 for failure and do nothing else. 


So in particular the latter two cases are indistinguishable by application code. 


This means realloc(ptr, 0) may not really free/deallocate the memory, and thus it should never be used as a 
replacement for free. 


Section 46.5: Multidimensional arrays of variable size 


Version ≥ C99 


Since C99, C has variable length arrays (VLAs), which model arrays whose bounds are only known at initialization 
time. While you have to be careful not to allocate VLAs that are too large (they might smash your stack), using 
pointers to VLAs and using them in sizeof expressions is fine. 


#include <stdio.h> 
#include <stdlib.h> 

double sumAll(size_t n, size_t m, double A[n][m]) { 
    double ret = 0.0; 
    for (size_t i = 0; i < n; ++i) 
        for (size_t j = 0; j < m; ++j) 
            ret += A[i][j]; 
    return ret; 
} 

int main(int argc, char *argv[argc+1]) { 
    size_t n = argc*10; 
    size_t m = argc*8; 
    double (*matrix)[m] = malloc(sizeof(double[n][m])); 
    // initialize matrix somehow 
    double res = sumAll(n, m, matrix); 
    printf("result is %g\n", res); 
    free(matrix); 
} 

Here matrix is a pointer to elements of type double[m], and the sizeof expression with double[n][m] ensures that 
it contains space for n such elements. 
All this space is allocated contiguously and can thus be deallocated by a single call to free. 


The presence of VLAs in the language also affects the possible declarations of arrays and pointers in function 
headers. A general integer expression is now permitted inside the [] of array parameters. For both functions, the 
expressions in [] use parameters that have been declared earlier in the parameter list. For sumAll these are the 
lengths that the user code expects for the matrix. As for all array function parameters in C, the outermost 
(first) dimension is rewritten to a pointer type, so this is equivalent to the declaration 


double sumAll(size_t n, size_t m, double (*A)[m]); 


That is, n is not really part of the function interface, but the information can be useful for documentation and it 
could also be used by bounds checking compilers to warn about out-of-bounds access. 


Likewise, for main, the expression argc+1 is the minimal length that the C standard prescribes for the argv argument. 


Note that officially VLA support is optional in C11, but we know of no compiler that implements C11 and that 
doesn't have them. You could test with the macro __STDC_NO_VLA__ if you must. 


Section 46.6: alloca: allocate memory on stack 


Caveat: alloca is only mentioned here for the sake of completeness. It is entirely non-portable (not covered by any 
of the common standards) and has a number of potentially dangerous features that make it unsafe for the 
unaware. Modern C code should replace it with Variable Length Arrays (VLA). 


Manual page 


#include <alloca.h> 
// the glibc version of stdlib.h includes alloca.h by default 

void foo(int size) { 
    char *data = alloca(size); 
    /* 
      function body; 
    */ 
    // data is automatically freed 
} 

alloca allocates memory in the stack frame of the caller; the space referenced by the returned pointer is 
automatically freed when the calling function returns. 


While this function is convenient for automatic memory management, be aware that requesting a large allocation 
could cause a stack overflow, and that you cannot use free on memory allocated with alloca (which could cause 
more issues with stack overflow). 


For these reasons it is not recommended to use alloca inside a loop or a recursive function. 


And because the memory is free'd upon function return you cannot return the pointer as a function result (the 
behavior would be undefined). 


Summary 


• call identical to malloc 
• automatically free'd upon function return 
• incompatible with the free and realloc functions (undefined behavior) 
• pointer cannot be returned as a function result (undefined behavior) 
• allocation size limited by stack space, which (on most machines) is a lot smaller than the heap space available 
  for use by malloc() 
• avoid using alloca() and VLAs (variable length arrays) in a single function 
• alloca() is not as portable as malloc() et al 


Recommendation 


• Do not use alloca() in new code 


Version ≥ C99 


Modern alternative. 


void foo(int size) { 
    char data[size]; 
    /* 
      function body; 
    */ 
    // data is automatically freed 
} 

This works where alloca() does, and works in places where alloca() doesn't (inside loops, for example). It does 
assume either a C99 implementation or a C11 implementation that does not define __STDC_NO_VLA__. 


Section 46.7: User-defined memory management 


malloc() often calls underlying operating system functions to obtain pages of memory. But there is nothing special 
about the function and it can be implemented in straight C by declaring a large static array and allocating from it 
(there is a slight difficulty in ensuring correct alignment, in practice aligning to 8 bytes is almost always adequate). 


To implement a simple scheme, a control block is stored in the region of memory immediately before the pointer to 
be returned from the call. This means that free() may be implemented by subtracting from the returned pointer 
and reading off the control information, which is typically the block size plus some information that allows it to be 
put back in the free list - a linked list of unallocated blocks. 


When the user requests an allocation, the free list is searched until a block of identical or larger size to the amount 
requested is found, then if necessary it is split. This can lead to memory fragmentation if the user is continually 
making many allocations and frees of unpredictable size and at unpredictable intervals (not all real programs 
behave like that; the simple scheme is often adequate for small programs). 


/* typical control block */ 
struct block 
{ 
    size_t size;         /* size of block */ 
    struct block *next;  /* next block in free list */ 
    struct block *prev;  /* back pointer to previous block in memory */ 
    void *padding;       /* need 16 bytes to make multiple of 8 */ 
}; 

static struct block arena[10000]; /* allocate from here */ 
static struct block *firstfree; 


Many programs require large numbers of allocations of small objects of the same size. This is very easy to 
implement. Simply use a block with a next pointer. So if a block of 32 bytes is required: 


union block 
{ 
    union block *next; 
    unsigned char payload[32]; 
}; 

static union block arena[100]; 
static union block *head; 

void init(void) 
{ 
    int i; 
    for (i = 0; i < 100 - 1; i++) 
        arena[i].next = &arena[i + 1]; 
    arena[i].next = 0; /* last one, null */ 
    head = &arena[0]; 
} 

void *block_alloc(void) 
{ 
    void *answer = head; 
    if (answer) 
        head = head->next; 
    return answer; 
} 

void block_free(void *ptr) 
{ 
    union block *block = ptr; 
    block->next = head; 
    head = block; 
} 


This scheme is extremely fast and efficient, and can be made generic with a certain loss of clarity. 
Chapter 47: Implementation-defined behaviour 


Section 47.1: Right shift of a negative integer 


int signed_integer = -1; 


// The right shift operation exhibits implementation-defined behavior: 
int result = signed_integer >> 1; 


Section 47.2: Assigning an out-of-range value to an integer 


// Supposing SCHAR_MAX, the maximum value that can be represented by a signed char, is 
// 127, the behavior of this assignment is implementation-defined: 

signed char integer; 

integer = 128; 


Section 47.3: Allocating zero bytes 


// The allocation functions have implementation-defined behavior when the requested size 
// of the allocation is zero. 
void *p = malloc(0); 


Section 47.4: Representation of signed integers 


Each signed integer type may be represented in any one of three formats; it is implementation-defined which one is 
used. The implementation in use for any given signed integer type at least as wide as int can be determined at 
runtime from the two lowest-order bits of the representation of value -1 in that type, like so: 


enum { sign_magnitude = 1, ones_compl = 2, twos_compl = 3, }; 
#define SIGN_REP(T) ((T)-1 & (T)3) 


switch (SIGN_REP(long)) { 
    case sign_magnitude: { /* do something */ break; } 
    case ones_compl:     { /* do otherwise */ break; } 
    case twos_compl:     { /* do yet else */ break; } 
    case 0:  { _Static_assert(SIGN_REP(long), "bogus sign representation"); } 
} 


The same pattern applies to the representation of narrower types, but they cannot be tested by this technique 
because the operands of & are subject to "the usual arithmetic conversions" before the result is computed. 
Chapter 48: Atomics 


Section 48.1: atomics and operators 
Atomic variables can be accessed concurrently between different threads without creating race conditions. 


/* a global static variable that is visible by all threads */ 
static unsigned _Atomic active = ATOMIC_VAR_INIT(0); 


int myThread(void* a) { 
    ++active;   // increment active race free 
    // do something 
    --active;   // decrement active race free 
    return 0; 
} 

All lvalue operations (operations that modify the object) that are allowed for the base type are allowed and will not 
lead to race conditions between different threads that access them. 


• Operations on atomic objects are generally orders of magnitude slower than normal arithmetic operations. 
  This also includes simple load or store operations. So you should only use them for critical tasks. 

• Usual arithmetic operations and assignment such as a = a+1; are in fact three operations on a: first a load, 
  then an addition and finally a store. This is not race free. Only the operations a += 1; and a++; are. 
Chapter 49: Jump Statements 


Section 49.1: Using return 


Returning a value 


One commonly used case: returning from main() 


#include <stdlib.h> /* for EXIT_xxx macros */ 

int main(int argc, char ** argv) 
{ 
    if (2 < argc) 
    { 
        return EXIT_FAILURE; /* The code expects one argument: 
                                leave immediately skipping the rest of the function's code */ 
    } 

    /* Do stuff. */ 

    return EXIT_SUCCESS; 
} 


Additional notes: 


1. For a function with return type void (not including void * or related types), the return statement 
must not have an associated expression; i.e., the only allowed return statement is return;. 


2. For a function having a non-void return type, the return statement shall not appear without an expression. 


3. For main() (and only for main()), an explicit return statement is not required (in C99 or later). If the execution 
reaches the terminating }, an implicit value of 0 is returned. Some people think omitting this return is bad 
practice; others actively suggest leaving it out. 


Returning nothing 


Returning from a void function 


void log_message(const char * message_to_log) 
{ 
    if (NULL == message_to_log) 
    { 
        return; /* Nothing to log, go home NOW, skip the logging. */ 
    } 

    fprintf(stderr, "%s:%d %s\n", __FILE__, __LINE__, message_to_log); 

    return; /* Optional, as this function does not return a value. */ 
} 


Section 49.2: Using goto to jump out of nested loops 


Jumping out of nested loops would usually require a boolean variable that is checked in each loop condition. 
Supposing we are iterating over i and j, it could look like this: 


size_t i, j; 
for (i = 0; i < myValue && !breakout_condition; ++i) { 
    for (j = 0; j < mySecondValue && !breakout_condition; ++j) { 
        ... /* Do something, maybe modifying breakout_condition */ 
            /* When breakout_condition == true the loops end */ 
    } 
} 


But the C language offers the goto statement, which can be useful in this case. By using it with a label declared 
after the loops, we can easily break out of the loops. 


size_t i, j; 
for (i = 0; i < myValue; ++i) { 
    for (j = 0; j < mySecondValue; ++j) { 
        ... 
        if (breakout_condition) 
            goto final; 
    } 
} 

final: 


However, when this need comes up, a return is often the better choice. This construct is also considered 
"unstructured" in structured programming theory. 


Another situation where goto might be useful is for jumping to an error-handler: 


ptr = malloc(N * x); 
if (!ptr) 
    goto out_of_memory; 


/* normal processing */ 
free(ptr); 
return SUCCESS; 


out_of_memory: 
free(ptr); /* harmless, and necessary if we have further errors */ 
return FAILURE; 


Use of goto keeps error flow separate from normal program control flow. It is however also considered 
"unstructured" in the technical sense. 


Section 49.3: Using break and continue 
Immediately continue reading on invalid input or break on user request or end-of-file: 


#include <stdlib.h> /* for EXIT_xxx macros */ 
#include <stdio.h>  /* for printf() and getchar() */ 
#include <ctype.h>  /* for isdigit() */ 

void flush_input_stream(FILE * fp); 

int main(void) 
{ 
    int sum = 0; 
    printf("Enter digits to be summed up or 0 to exit:\n"); 

    do 
    { 
        int c = getchar(); 

        if (EOF == c) 
        { 
            printf("Read 'end-of-file', exiting!\n"); 
            break; 
        } 

        if ('\n' != c) 
        { 
            flush_input_stream(stdin); 
        } 

        if (!isdigit(c)) 
        { 
            printf("%c is not a digit! Start over!\n", c); 
            continue; 
        } 

        if ('0' == c) 
        { 
            printf("Exit requested.\n"); 
            break; 
        } 

        sum += c - '0'; 

        printf("The current sum is %d.\n", sum); 
    } while (1); 

    return EXIT_SUCCESS; 
} 

void flush_input_stream(FILE * fp) 
{ 
    size_t i = 0; 
    int c; 
    while ((c = fgetc(fp)) != '\n' && c != EOF) /* Pull all until and including the next new-line. */ 
    { 
        ++i; 
    } 

    if (0 != i) 
    { 
        fprintf(stderr, "Flushed %zu characters from input.\n", i); 
    } 
} 
Chapter 50: Create and include header files 


In modern C, header files are crucial tools that must be designed and used correctly. They allow the compiler to 
cross-check independently compiled parts of a program. 


Headers declare types, functions, macros, etc. that are needed by the consumers of a set of facilities. All the code 
that uses any of those facilities includes the header. All the code that defines those facilities includes the header. 
This allows the compiler to check that the uses and definitions match. 


Section 50.1: Introduction 


There are a number of guidelines to follow when creating and using header files in a C project: 


Idempotence 


If a header file is included multiple times in a translation unit (TU), it should not break builds. 


Self-containment 


If you need the facilities declared in a header file, you should not have to include any other headers explicitly. 


Minimality 


You should not be able to remove any information from a header without causing builds to fail. 


Include What You Use (IWYU) 


Of more concern to C++ than C, but nevertheless important in C too. If the code in a TU (call it code.c) 
directly uses the features declared by a header (call it "headerA.h"), then code.c should #include 
"headerA.h" directly, even if the TU includes another header (call it "headerB.h") that happens, at the 
moment, to include "headerA.h". 


Occasionally, there might be good enough reasons to break one or more of these guidelines, but you should both 
be aware that you are breaking the rule and be aware of the consequences of doing so before you break it. 


Section 50.2: Self-containment 


Modern headers should be self-contained, which means that a program that needs to use the facilities defined by 
header.h can include that header (#include "header.h") and not worry about whether other headers need to be 
included first. 


Recommendation: Header files should be self-contained. 


Historical rules 


Historically, this has been a mildly contentious subject. 
Once upon another millennium, the AT&T Indian Hill C Style and Coding Standards stated: 


Header files should not be nested. The prologue for a header file should, therefore, describe what other 
headers need to be #included for the header to be functional. In extreme cases, where a large number 
of header files are to be included in several different source files, it is acceptable to put all common 
#includes in one include file. 


This is the antithesis of self-containment. 
Modern rules 


However, since then, opinion has tended in the opposite direction. If a source file needs to use the facilities 
declared by a header header.h, the programmer should be able to write: 


#include "header.h" 


and (subject only to having the correct search paths set on the command line), any necessary pre-requisite headers 
will be included by header.h without needing any further headers added to the source file. 


This provides better modularity for the source code. It also protects the source from the "guess why this header 
was added" conundrum that arises after the code has been modified and hacked for a decade or two. 


The NASA Goddard Space Flight Center (GSFC) coding standards for C is one of the more modern standards, but 
is now a little hard to track down. It states that headers should be self-contained. It also provides a simple way to 
ensure that headers are self-contained: the implementation file for the header should include the header as the 
first header. If it is not self-contained, that code will not compile. 


The rationale given by GSFC includes: 


§2.1.1 Header include rationale 


This standard requires a unit’s header to contain #include statements for all other headers required by 
the unit header. Placing #include for the unit header first in the unit body allows the compiler to verify 
that the header contains all required #include statements. 


An alternate design, not permitted by this standard, allows no #include statements in headers; all 
#includes are done in the body files. Unit header files then must contain #ifdef statements that check 
that the required headers are included in the proper order. 


One advantage of the alternate design is that the #include list in the body file is exactly the dependency 
list needed in a makefile, and this list is checked by the compiler. With the standard design, a tool must be 
used to generate the dependency list. However, all of the branch recommended development 
environments provide such a tool. 


A major disadvantage of the alternate design is that if a unit's required header list changes, each file that 
uses that unit must be edited to update the #include statement list. Also, the required header list for a 
compiler library unit may be different on different targets. 


Another disadvantage of the alternate design is that compiler library header files, and other third party 
files, must be modified to add the required #ifdef statements. 


Thus, self-containment means that: 


• If a header header.h needs a new nested header extra.h, you do not have to check every source file that 
uses header.h to see whether you need to add extra.h. 

• If a header header.h no longer needs to include a specific header notneeded.h, you do not have to check 
every source file that uses header.h to see whether you can safely remove notneeded.h (but see Include 
What You Use). 

• You do not have to establish the correct sequence for including the prerequisite headers (which requires a 
topological sort to do the job properly). 


Checking self-containment 


See Linking against a static library for a script chkhdr that can be used to test idempotence and self-containment of 
a header file. 


Section 50.3: Minimality 


Headers are a crucial consistency checking mechanism, but they should be as small as possible. In particular, that 
means that a header should not include other headers just because the implementation file will need the other 
headers. A header should contain only those headers necessary for a consumer of the services described. 


For example, a project header should not include <stdio.h> unless one of the function interfaces uses the type 
FILE * (or one of the other types defined solely in <stdio.h>). If an interface uses size_t, the smallest header that 
suffices is <stddef.h>. Obviously, if another header that defines size_t is included, there is no need to include 
<stddef.h> too. 


Keeping headers minimal also keeps compilation time to a minimum. 
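
As a rough illustration of minimality, the sketch below shows a hypothetical header, buffer.h, whose interface mentions only size_t. The names buffer.h, BUFFER_H_INCLUDED and buffer_checksum are invented for this example and are not part of any real library; both files are shown in one listing for brevity. The point is only that <stddef.h> is enough for the interface.

```c
/* ---- buffer.h (hypothetical project header) ---- */
#ifndef BUFFER_H_INCLUDED
#define BUFFER_H_INCLUDED

#include <stddef.h>   /* size_t is the only outside type the interface uses */

/* Sum of the first len bytes of data, reduced modulo 256. */
unsigned buffer_checksum(const unsigned char *data, size_t len);

#endif /* BUFFER_H_INCLUDED */

/* ---- buffer.c (would start with: #include "buffer.h") ---- */
unsigned buffer_checksum(const unsigned char *data, size_t len)
{
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum = (sum + data[i]) % 256u;
    return sum;
}
```

If the implementation later needed <stdio.h> for logging, that include would belong in buffer.c, not in buffer.h, because consumers of the interface do not need it.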


It is possible to devise headers whose sole purpose is to include a lot of other headers. These seldom turn out to be 
a good idea in the long run because few source files will actually need all the facilities described by all the headers. 
For example, a <standard-c.h> could be devised that includes all the standard C headers — with care since some 
headers are not always present. However, very few programs actually use the facilities of <locale.h> or 
<tgmath.h>. 


• See also How to link multiple implementation files in C? 


Section 50.4: Notation and Miscellany 


The C standard says that there is very little difference between the #include <header.h> and #include 
"header.h" notations. 


[#include <header.h>] searches a sequence of implementation-defined places for a header identified 
uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that 
directive by the entire contents of the header. How the places are specified or the header identified is 
implementation-defined. 

[#include "header.h"] causes the replacement of that directive by the entire contents of the source file 
identified by the specified sequence between the "..." delimiters. The named source file is searched for in 
an implementation-defined manner. If this search is not supported, or if the search fails, the directive is 
reprocessed as if it read [#include <header.h>] ... 


So, the double quoted form may look in more places than the angle-bracketed form. The standard specifies by 
example that the standard headers should be included in angle-brackets, even though the compilation works if you 
use double quotes instead. Similarly, standards such as POSIX use the angle-bracketed format — and you should 
too. Reserve double-quoted headers for headers defined by the project. For externally-defined headers (including 
headers from other projects your project relies on), the angle-bracket notation is most appropriate. 


Note that there should be a space between #include and the header, even though the compilers will accept no 
space there. Spaces are cheap. 


A number of projects use a notation such as: 


#include <openssl/ssl.h> 
#include <sys/stat.h> 
#include <linux/kernel.h> 


You should consider whether to use that namespace control in your project (it is quite probably a good idea). You 
should steer clear of the names used by existing projects (in particular, both sys and linux would be bad choices). 


If you use this, your code should be careful and consistent in the use of the notation. 
Do not use #include "../include/header.h" notation. 


Header files should seldom if ever define variables. Although you will keep global variables to a minimum, if you 
need a global variable, declare it in a header and define it in exactly one suitable source file. That source file 
includes the header so that the compiler can cross-check the declaration against the definition, and every source 
file that uses the variable includes the header to declare it. 


Corollary: you will not declare global variables in a source file — a source file will only contain definitions. 


Header files should seldom declare static functions, with the notable exception of static inline functions which 
will be defined in headers if the function is needed in more than one source file. 


• Source files define global variables and global functions. 

• Source files do not declare the existence of global variables or functions; they include the header that 
declares the variable or function. 

• Header files declare global variables and functions (and types and other supporting material). 

• Header files do not define variables or any functions except (static) inline functions. 
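
The rules above can be sketched with a hypothetical pair of files, counter.h and counter.c, shown here in a single listing. The names event_count and record_event are made up for the illustration; in a real project the second part would live in its own source file and begin with #include "counter.h".

```c
/* ---- counter.h (hypothetical): declarations only ---- */
#ifndef COUNTER_H_INCLUDED
#define COUNTER_H_INCLUDED

extern int event_count;    /* declares the variable; reserves no storage */
void record_event(void);   /* declares the function */

#endif /* COUNTER_H_INCLUDED */

/* ---- counter.c: would begin with #include "counter.h" ---- */
int event_count = 0;       /* the one and only definition */

void record_event(void)
{
    ++event_count;
}
```

Because the defining file sees the declaration too, the compiler complains if the two ever disagree, for example if the header said long and the source file said int.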


Cross-references 


• Where to document functions in C? 
• List of standard header files in C and C++ 
• Is inline without static or extern ever useful in C99? 
• How do I use extern to share variables between source files? 
• What are the benefits of a relative path such as "../include/header.h" for a header? 
• Header inclusion optimization 
• Should I include every header? 


Section 50.5: Idempotence 


If a particular header file is included more than once in a translation unit (TU), there should not be any compilation 
problems. This is termed 'idempotence'; your headers should be idempotent. Think how difficult life would be if you 
had to ensure that #include <stdio.h> was only included once. 


There are two ways to achieve idempotence: header guards and the #pragma once directive. 


Header guards 


Header guards are simple and reliable and conform to the C standard. The first non-comment lines in a header file 
should be of the form: 


#ifndef UNIQUE_ID_FOR_HEADER 
#define UNIQUE_ID_FOR_HEADER 


The last non-comment line should be #endif, optionally with a comment after it: 


#endif /* UNIQUE_ID_FOR_HEADER */ 


All the operational code, including other #include directives, should be between these lines. 


Each name must be unique. Often, a name scheme such as HEADER_H_INCLUDED is used. Some older code uses a 
symbol defined as the header guard (e.g. #ifndef BUFSIZ in <stdio.h>), but it is not as reliable as a unique name. 


One option would be to use a generated MD5 (or other) hash for the header guard name. You should avoid 
emulating the schemes used by system headers, which frequently use names reserved to the implementation: 
names starting with an underscore followed by either another underscore or an upper-case letter. 
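
To see what the guard buys you, the sketch below writes out by hand what the preprocessor effectively sees when a hypothetical header point.h is included twice in one translation unit. The names point.h, POINT_H_INCLUDED, struct point and point_sum are assumptions made for this example.

```c
/* First inclusion of point.h: the guard macro is not yet defined,
   so the contents are processed and the macro becomes defined. */
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct point { int x; int y; };
static inline int point_sum(struct point p) { return p.x + p.y; }
#endif /* POINT_H_INCLUDED */

/* Second inclusion: the guard macro is now defined, so everything
   up to the matching #endif is skipped. */
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct point { int x; int y; };
static inline int point_sum(struct point p) { return p.x + p.y; }
#endif /* POINT_H_INCLUDED */
```

Without the #ifndef/#define/#endif lines, the second copy would redefine struct point and point_sum, and the compiler would reject the translation unit.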


The #pragma once Directive 


Alternatively, some compilers support the #pragma once directive which has the same effect as the three lines 
shown for header guards. 


#pragma once 


The compilers which support #pragma once include MS Visual Studio and GCC and Clang. However, if portability is a 
concern, it is better to use header guards, or use both. Modern compilers (those supporting C89 or later) are 
required to ignore, without comment, pragmas that they do not recognize (‘Any such pragma that is not recognized 
by the implementation is ignored’) but old versions of GCC were not so indulgent. 


Section 50.6: Include What You Use (IWYU) 


Google's Include What You Use project, or IWYU, ensures source files include all headers used in the code. 


Suppose a source file source.c includes a header arbitrary.h which in turn coincidentally includes freeloader.h, 
but the source file also explicitly and independently uses the facilities from freeloader.h. All is well to start with. 
Then one day arbitrary.h is changed so its clients no longer need the facilities of freeloader.h. Suddenly, 
source.c stops compiling, because it didn't meet the IWYU criteria. Because the code in source.c explicitly used 
the facilities of freeloader.h, it should have included what it uses: there should have been an explicit #include 
"freeloader.h" in the source too. (Idempotency would have ensured there wasn't a problem.) 


The IWYU philosophy maximizes the probability that code continues to compile even with reasonable changes 
made to interfaces. Clearly, if your code calls a function that is subsequently removed from the published interface, 
no amount of preparation can prevent changes becoming necessary. This is why changes to APIs are avoided when 
possible, and why there are deprecation cycles over multiple releases, etc. 


This is a particular problem in C++ because standard headers are allowed to include each other. Source file 
file.cpp could include one header header1.h that on one platform includes another header header2.h. file.cpp 
might turn out to use the facilities of header2.h as well. This wouldn't be a problem initially - the code would 
compile because header1.h includes header2.h. On another platform, or an upgrade of the current platform, 
header1.h could be revised so it no longer includes header2.h, and then file.cpp would stop compiling as a result. 


IWYU would spot the problem and recommend that header2.h be included directly in file.cpp. This would ensure 
it continues to compile. Analogous considerations apply to C code too. 


Chapter 51: <ctype.h> - character classification & conversion 


Section 51.1: Introduction 


The header ctype.h is a part of the standard C library. It provides functions for classifying and converting 
characters. 


All of these functions take one parameter, an int that must be either EOF or representable as an unsigned char. 


The names of the classifying functions are prefixed with 'is'. Each returns an integer non-zero value (TRUE) if the 
character passed to it satisfies the related condition. If the condition is not satisfied then the function returns a zero 
value (FALSE). 


These classifying functions operate as shown, assuming the default C locale: 


int a; 
int c = 'A'; 

a = isalpha(c);  /* Checks if c is alphabetic (A-Z, a-z), returns non-zero here. */ 
a = isalnum(c);  /* Checks if c is alphanumeric (A-Z, a-z, 0-9), returns non-zero here. */ 
a = iscntrl(c);  /* Checks if c is a control character (0x00-0x1F, 0x7F), returns zero here. */ 
a = isdigit(c);  /* Checks if c is a digit (0-9), returns zero here. */ 
a = isgraph(c);  /* Checks if c has a graphical representation (any printing character except space), 
                    returns non-zero here. */ 
a = islower(c);  /* Checks if c is a lower-case letter (a-z), returns zero here. */ 
a = isprint(c);  /* Checks if c is any printable character (including space), returns non-zero here. */ 
a = ispunct(c);  /* Checks if c is a punctuation character, returns zero here. */ 
a = isspace(c);  /* Checks if c is a white-space character, returns zero here. */ 
a = isupper(c);  /* Checks if c is an upper-case letter (A-Z), returns non-zero here. */ 
a = isxdigit(c); /* Checks if c is a hexadecimal digit (A-F, a-f, 0-9), returns non-zero here. */ 

Version ≥ C99 

a = isblank(c);  /* Checks if c is a blank character (space or tab), returns zero here. */ 


There are two conversion functions. These are named using the prefix 'to'. These functions take the same argument 
as those above. However the return value is not a simple zero or non-zero but the passed argument changed in 
some manner. 


These conversion functions operate as shown, assuming the default C locale: 


int a; 
int c = 'A'; 

/* Converts c to a lower-case letter (a-z). 
 * If conversion is not possible the unchanged value is returned. 
 * Returns 'a' here. 
 */ 
a = tolower(c); 

/* Converts c to an upper-case letter (A-Z). 
 * If conversion is not possible the unchanged value is returned. 
 * Returns 'A' here. 
 */ 
a = toupper(c); 


The information below is taken from cplusplus.com and maps how each character of the original 7-bit ASCII set 
(0x00 to 0x7F) is treated by each of the classifying functions. For each range, the last column lists the 
functions that return non-zero for those characters. 


ASCII values   Characters                     Functions returning non-zero 
0x00 .. 0x08   NUL, (other control codes)     iscntrl 
0x09           tab ('\t')                     iscntrl, isblank, isspace 
0x0A .. 0x0D   (white-space control codes:    iscntrl, isspace 
               '\f', '\v', '\n', '\r') 
0x0E .. 0x1F   (other control codes)          iscntrl 
0x20           space (' ')                    isblank, isspace, isprint 
0x21 .. 0x2F   !"#$%&'()*+,-./                ispunct, isgraph, isprint 
0x30 .. 0x39   0123456789                     isdigit, isxdigit, isalnum, isgraph, isprint 
0x3A .. 0x40   :;<=>?@                        ispunct, isgraph, isprint 
0x41 .. 0x46   ABCDEF                         isupper, isalpha, isxdigit, isalnum, isgraph, isprint 
0x47 .. 0x5A   GHIJKLMNOPQRSTUVWXYZ           isupper, isalpha, isalnum, isgraph, isprint 
0x5B .. 0x60   [\]^_`                         ispunct, isgraph, isprint 
0x61 .. 0x66   abcdef                         islower, isalpha, isxdigit, isalnum, isgraph, isprint 
0x67 .. 0x7A   ghijklmnopqrstuvwxyz           islower, isalpha, isalnum, isgraph, isprint 
0x7B .. 0x7E   {|}~                           ispunct, isgraph, isprint 
0x7F           (DEL)                          iscntrl 


Section 51.2: Classifying characters read from a stream 


#include <ctype.h> 
#include <stdio.h> 

typedef struct { 
    size_t space; 
    size_t alnum; 
    size_t punct; 
} chartypes; 

chartypes classify(FILE *f) { 
    chartypes types = { 0, 0, 0 }; 
    int ch; 

    while ((ch = fgetc(f)) != EOF) { 
        types.space += !!isspace(ch); 
        types.alnum += !!isalnum(ch); 
        types.punct += !!ispunct(ch); 
    } 

    return types; 
} 


The classify function reads characters from a stream and counts the number of spaces, alphanumeric and 
punctuation characters. It avoids several pitfalls. 


• When reading a character from a stream, the result is saved as an int, since otherwise there would be an 
ambiguity between reading EOF (the end-of-file marker) and a character that has the same bit pattern. 

• The character classification functions (e.g. isspace) expect their argument to be either representable as an 
unsigned char, or the value of the EOF macro. Since this is exactly what fgetc returns, there is no need for 
conversion here. 

• The return value of the character classification functions only distinguishes between zero (meaning false) 
and nonzero (meaning true). For counting the number of occurrences, this value needs to be converted to a 
1 or 0, which is done by the double negation, !!. 


Section 51.3: Classifying characters from a string 


#include <ctype.h> 
#include <stddef.h> 

typedef struct { 
    size_t space; 
    size_t alnum; 
    size_t punct; 
} chartypes; 

chartypes classify(const char *s) { 
    chartypes types = { 0, 0, 0 }; 
    const char *p; 

    /* Stop at the terminating null character, not at a null pointer */ 
    for (p = s; *p != '\0'; p++) { 
        types.space += !!isspace((unsigned char)*p); 
        types.alnum += !!isalnum((unsigned char)*p); 
        types.punct += !!ispunct((unsigned char)*p); 
    } 

    return types; 
} 


The classify function examines all characters from a string and counts the number of spaces, alphanumeric and 
punctuation characters. It avoids several pitfalls. 


• The character classification functions (e.g. isspace) expect their argument to be either representable as an 
unsigned char, or the value of the EOF macro. 

• The expression *p is of type char and must therefore be converted to match the above wording. 

    ◦ The char type is defined to be equivalent to either signed char or unsigned char. 

    ◦ When char is equivalent to unsigned char, there is no problem, since every possible value of the char type is 
    representable as unsigned char. 

    ◦ When char is equivalent to signed char, it must be converted to unsigned char before being passed to the 
    character classification functions. And although the value of the character may change because of this 
    conversion, this is exactly what these functions expect. 

• The return value of the character classification functions only distinguishes between zero (meaning false) 
and nonzero (meaning true). For counting the number of occurrences, this value needs to be converted to a 
1 or 0, which is done by the double negation, !!. 
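
The same unsigned char conversion applies when a conversion function such as toupper is used on string data. The following is a minimal sketch; str_to_upper is a hypothetical helper name chosen for this example, not a standard library function.

```c
#include <ctype.h>

/* Upper-cases the string s in place and returns s (hypothetical helper). */
char *str_to_upper(char *s)
{
    char *p;

    for (p = s; *p != '\0'; p++) {
        /* Convert to unsigned char first so that a negative char value
           is never passed to toupper(). */
        *p = (char)toupper((unsigned char)*p);
    }
    return s;
}
```

Called on a modifiable array holding "abc-1", this leaves "ABC-1" in the array in the default C locale; characters with no upper-case form are returned unchanged by toupper.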


Chapter 52: Side Effects 


Section 52.1: Pre/Post Increment/Decrement operators 


In C, there are two unary operators, '++' and '--', that are a very common source of confusion. The operator ++ is 
called the increment operator and the operator -- is called the decrement operator. Both of them can be used 
in either prefix form or postfix form. The syntax for the prefix form of the ++ operator is ++operand and the syntax 
for the postfix form is operand++. When used in the prefix form, the operand is first incremented by 1 and the 
resulting value of the operand is used in the evaluation of the expression. Consider the following example: 


int n, x = 5; 
n = ++x; /* x is incremented by 1 (x=6), and the result is assigned to n (6) */ 
         /* this is a short form for two statements: */ 
         /* x = x + 1; */ 
         /* n = x; */ 


When used in the postfix form, the operand's current value is used in the expression and then the value of the 
operand is incremented by 1. Consider the following example: 


int n, x = 5; 
n = x++; /* value of x (5) is assigned first to n (5), and then x is incremented by 1; x (6) */ 
         /* this is a short form for two statements: */ 
         /* n = x; */ 
         /* x = x + 1; */ 


The working of the decrement operator -- can be understood similarly. 

The following code demonstrates what each one does: 


int main() 
{ 
    int a, b, x = 42; 
    a = ++x; /* a and x are 43 */ 
    b = x++; /* b is 43, x is 44 */ 
    a = x--; /* a is 44, x is 43 */ 
    b = --x; /* b and x are 42 */ 
    return 0; 
} 


From the above it is clear that post operators return the current value of a variable and then modify it, but pre 
operators modify the variable and then return the modified value. 


In all versions of C, modifying a variable and also reading or modifying it elsewhere in the same expression, with no 
sequence point in between, is undefined behavior. Hence the following code can produce unexpected results: 


int main() 
{ 
    int a, x = 42; 
    a = x++ + x; /* wrong */ 
    a = x + x;   /* right */ 

    int ar[10]; 
    x = 0; 
    ar[x] = x++; /* wrong */ 
    ar[x++] = x; /* wrong */ 
    ar[x] = x;   /* right */ 
    ++x; 

    return 0; 
} 


Note that it is also good practice to use pre over post operators when used alone in a statement. Look at the above 
code for this. 


Note also that when a function is called, all side effects on its arguments take place before the function body runs. 


int foo(int x) 
{ 
    return x; 
} 

int main() 
{ 
    int a = 42; 
    int b = foo(a++); /* foo receives and returns 42; a is already 43 by the time foo's body runs */ 
    return 0; 
} 


Chapter 53: Multi-Character Character Sequence 


Section 53.1: Trigraphs 


The symbols [ ] { } ^ \ | ~ # are frequently used in C programs, but in the late 1980s, there were code sets in 
use (ISO 646 variants, for example, in Scandinavian countries) where the ASCII character positions for these were 
used for national language variant characters (e.g. £ for # in the UK; Æ Å æ å ø Ø for [ ] { } | \ in Denmark; 
there was no ~ in EBCDIC). This meant that it was hard to write C code on machines that used these sets. 


To solve this problem, the C standard suggested the use of combinations of three characters to produce a single 
character called a trigraph. A trigraph is a sequence of three characters, the first two of which are question marks. 


The following is a simple example that uses trigraph sequences instead of #, { and }: 


??=include <stdio.h> 

int main() 
??< 
    printf("Hello World!\n"); 
??> 


This will be changed by the C preprocessor by replacing the trigraphs with their single-character equivalents as if 
the code had been written: 


#include <stdio.h> 

int main() 
{ 
    printf("Hello World!\n"); 
} 


Trigraph   Equivalent 
??=        # 
??/        \ 
??'        ^ 
??(        [ 
??)        ] 
??!        | 
??<        { 
??>        } 
??-        ~ 


Note that trigraphs are problematic because, for example, ??/ is a backslash and can affect the meaning of 
continuation lines in comments, and trigraphs have to be recognized inside strings and character literals 
(e.g. '??/??/' is a single character, a backslash). 


Section 53.2: Digraphs 


Version ≥ C99 


In 1994 more readable alternatives to five of the trigraphs were supplied. These use only two characters and are 
known as digraphs. Unlike trigraphs, digraphs are tokens. If a digraph occurs in another token (e.g. string literals or 
character constants) then it will not be treated as a digraph, but remain as it is. 

The following shows the difference before and after processing the digraph sequences. 


#include <stdio.h> 

int main() 
<% 
    printf("Hello %> World!\n"); /* Note that the string contains a digraph */ 
%> 


Which will be treated the same as: 


#include <stdio.h> 

int main() 
{ 
    printf("Hello %> World!\n"); /* Note the unchanged digraph within the string. */ 
} 


Digraph   Equivalent 
<:        [ 
:>        ] 
<%        { 
%>        } 
%:        # 


Chapter 54: Constraints 


Section 54.1: Duplicate variable names in the same scope 


An example of a constraint as expressed in the C standard is having two variables of the same name declared in the 
same scope 1), for example: 


void foo(int bar) 
{ 
    int var; 
    double var; 
} 


This code breaches the constraint and must produce a diagnostic message at compile time. This is far more useful 
than undefined behavior, because the developer is informed of the issue before the program is run, whereas a 
program with undefined behavior could potentially do anything. 


Constraints thus tend to be errors that are easily detectable at compile time, such as this one. Issues that result in 
undefined behavior but would be difficult or impossible to detect at compile time are thus not constraints. 


1) exact wording: 


Version ≥ C99 


If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type 
specifier) with the same scope and in the same name space, except for tags as specified in 6.7.2.3. 


Section 54.2: Unary arithmetic operators 


The unary + and - operators are only usable on arithmetic types. If, for example, one tries to use them on 
a struct, the program will produce a diagnostic, e.g.: 


#include <stdbool.h> 

struct foo 
{ 
    bool bar; 
}; 

void baz(void) 
{ 
    struct foo testStruct; 

    -testStruct; /* This breaks the constraint so must produce a diagnostic */ 
} 


Chapter 55: Inlining 


Section 55.1: Inlining functions used in more than one source file 


For small functions that get called often, the overhead associated with the function call can be a significant fraction 
of the total execution time of that function. One way of improving performance, then, is to eliminate the overhead. 


In this example we use four functions (plus main()) in three source files. Two of those (plusfive() and timestwo()) 
each get called by the other two located in "source1.c" and "source2.c". The main() is included so we have a 
working example. 


main.c: 


#include <stdio.h> 
#include <stdlib.h> 
#include "headerfile.h" 


int main(void) { 
    int start = 3; 
    int intermediate = complicated1(start); 
    printf("First result is %d\n", intermediate); 
    intermediate = complicated2(start); 
    printf("Second result is %d\n", intermediate); 
    return 0; 
} 


source1.c: 


#include <stdio.h> 
#include <stdlib.h> 
#include "headerfile.h" 

int complicated1(int input) { 
    int tmp = timestwo(input); 
    tmp = plusfive(tmp); 
    return tmp; 
} 


source2.c: 


#include <stdio.h> 
#include <stdlib.h> 
#include "headerfile.h" 

int complicated2(int input) { 
    int tmp = plusfive(input); 
    tmp = timestwo(tmp); 
    return tmp; 
} 


headerfile.h: 


#ifndef HEADERFILE_H 
#define HEADERFILE_H 

int complicated1(int input); 
int complicated2(int input); 

inline int timestwo(int input) { 
    return input * 2; 
} 

inline int plusfive(int input) { 
    return input + 5; 
} 

#endif 


Functions timestwo and plusfive get called by both complicated1 and complicated2, which are in different 
"translation units", or source files. In order to use them in this way, we have to define them in the header. 


Compile like this, assuming gcc: 


cc -O2 -std=c99 -c -o main.o main.c 
cc -O2 -std=c99 -c -o source1.o source1.c 
cc -O2 -std=c99 -c -o source2.o source2.c 
cc main.o source1.o source2.o -o main 


We use the -O2 optimization option because some compilers don't inline without optimization turned on. 


The effect of the inline keyword is that the function symbol in question is not emitted into the object file. 
Otherwise an error would occur in the last line, where we are linking the object files to form the final executable. If 
we did not have inline, the same symbol would be defined in both .o files, and a "multiply defined symbol" 
error would occur. 


In situations where the symbol is actually needed, this has the disadvantage that the symbol is not produced at all. 
There are two possibilities to deal with that. The first is to add an extra extern declaration of the inlined functions 
in exactly one of the .c files. So add the following to source1.c: 


extern int timestwo(int input); 
extern int plusfive(int input); 


The other possibility is to define the function with static inline instead of inline. This method has the drawback 
that eventually a copy of the function in question may be produced in every object file that is produced with this 
header. 
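
As a sketch of the first possibility, the listing below puts the inline definitions (as they would appear in headerfile.h) and the extra extern declarations (as they would appear in exactly one .c file) into a single translation unit. It assumes C99 or later inline semantics, which is what -std=c99 selects in the build commands above.

```c
/* ---- as in headerfile.h: inline definitions ---- */
inline int timestwo(int input)
{
    return input * 2;
}

inline int plusfive(int input)
{
    return input + 5;
}

/* ---- in exactly one .c file, after including the header ---- */
/* These declarations make this translation unit emit the external
   definitions, so calls that were not inlined can still be linked. */
extern int timestwo(int input);
extern int plusfive(int input);
```

With these declarations in place the program should link even when the compiler chooses not to inline a call, for example when building without optimization.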


GoalKicker.com - C Notes for Professionals 286 


Chapter 56: Unions 


Section 56.1: Using unions to reinterpret values 


Some C implementations permit code to write to one member of a union type and then read from another in order to 
perform a sort of reinterpreting cast (parsing the new type as the bit representation of the old one). 


It is important to note, however, that the standards have not always been clear on this. Earlier wording left the 
value of a member other than the last one stored unspecified, while C99 (since TC3) and C11 describe the read as 
reinterpreting the stored bytes in the new type, which may yield a trap representation. The result therefore depends 
on the representations your implementation uses, and the technique is undefined behavior in C++, so check your 
compiler's documentation if you plan to do this. 


One real-life example of this technique is the "Fast Inverse Square Root" algorithm, which relies on implementation 
details of IEEE 754 floating point numbers to compute an inverse square root more quickly than ordinary floating 
point operations. The algorithm can be performed either through pointer casting (which is very dangerous and breaks 
the strict aliasing rule) or through a union (which is non-portable but works in many compilers): 


#include <stdint.h> 

union floatToInt 
{ 
    int32_t intMember; 
    float floatMember; /* Float must be 32 bits IEEE 754 for this to work */ 
}; 

float inverseSquareRoot(float input) 
{ 
    union floatToInt x; 
    int32_t i; 
    float f; 
    x.floatMember = input;     /* Assign to the float member */ 
    i = x.intMember;           /* Read back from the integer member */ 
    i = 0x5f3759df - (i >> 1); 
    x.intMember = i;           /* Assign to the integer member */ 
    f = x.floatMember;         /* Read back from the float member */ 
    f = f * (1.5f - input * 0.5f * f * f); 
    return f * (1.5f - input * 0.5f * f * f); 
} 


This technique was widely used in computer graphics and games in the past due to its greater speed compared to 
using floating point operations, and is very much a compromise, losing some accuracy and being very non portable 
in exchange for speed. 
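
Where only the bit pattern is needed, copying the bytes with memcpy is a commonly used alternative that does not rely on reading an inactive union member. The sketch below is not part of the original algorithm; float_bits is a hypothetical helper name, and the value mentioned afterwards assumes that float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Returns the object representation of a float as a 32-bit unsigned
   integer. Returns 0 if float is not 32 bits wide on this platform. */
uint32_t float_bits(float value)
{
    uint32_t bits = 0;

    if (sizeof bits == sizeof value)
        memcpy(&bits, &value, sizeof bits);  /* copy the bytes unchanged */
    return bits;
}
```

On an implementation with 32-bit IEEE 754 floats, float_bits(1.0f) yields 0x3f800000. Optimizing compilers typically turn the memcpy call into a plain register move, so this form need not be slower than the union.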


Section 56.2: Writing to one union member and reading from another 


The members of a union share the same space in memory. This means that writing to one member overwrites the 
data in all other members and that reading from one member results in the same data as reading from all other 
members. However, because union members can have differing types and sizes, the data that is read can be 
interpreted differently; see 
http://stackoverflow.com/documentation/c/1119/structs-and-unions/9399/using-unions-to-reinterpret-values 


The simple example below demonstrates a union with two members, both of the same type. It shows that writing 
to member m_1 results in the written value being read from member m_2 and writing to member m_2 results in the 
written value being read from member m_1. 


#include <stdio.h> 

union my_union /* Define union */ 
{ 
    int m_1; 
    int m_2; 
}; 

int main (void) 
{ 
    union my_union u;             /* Declare union */ 
    u.m_1 = 1;                    /* Write to m_1 */ 
    printf("u.m_2: %i\n", u.m_2); /* Read from m_2 */ 
    u.m_2 = 2;                    /* Write to m_2 */ 
    printf("u.m_1: %i\n", u.m_1); /* Read from m_1 */ 
    return 0; 
} 

Result 

u.m_2: 1 
u.m_1: 2 


Section 56.3: Difference between struct and union 


This illustrates that union members share memory and that struct members do not share memory. 


#include <stdio.h> 
#include <string.h> 

union My_Union 
{ 
    int variable_1; 
    int variable_2; 
}; 

struct My_Struct 
{ 
    int variable_1; 
    int variable_2; 
}; 

int main (void) 
{ 
    union My_Union u; 
    struct My_Struct s; 
    u.variable_1 = 1; 
    u.variable_2 = 2; 
    s.variable_1 = 1; 
    s.variable_2 = 2; 
    printf ("u.variable_1: %i\n", u.variable_1); 
    printf ("u.variable_2: %i\n", u.variable_2); 
    printf ("s.variable_1: %i\n", s.variable_1); 
    printf ("s.variable_2: %i\n", s.variable_2); 
    /* sizeof yields a size_t, which is printed with %zu */ 
    printf ("sizeof (union My_Union): %zu\n", sizeof (union My_Union)); 
    printf ("sizeof (struct My_Struct): %zu\n", sizeof (struct My_Struct)); 
    return 0; 
} 


Chapter 57: Threads (native) 


Section 57.1: Initialization by one thread 


In most cases all data that is accessed by several threads should be initialized before the threads are created. This 
ensures that all threads start with a clear state and no race condition occurs. 


If this is not possible, once_flag and call_once can be used: 


#include <threads.h>
#include <stdio.h>
#include <stdlib.h>

// largeNum and initializeBigWithSophisticatedValues are placeholders that
// are assumed to be defined elsewhere in the program

// the user data for this example
double const* Big = 0;

// the flag to protect Big, must be global and/or static
static once_flag onceBig = ONCE_FLAG_INIT;

void destroyBig(void) {
    free((void*)Big);
}

void initBig(void) {
    // assign to temporary with no const qualification
    double* b = malloc(largeNum);
    if (!b) {
        perror("allocation failed for Big");
        exit(EXIT_FAILURE);
    }
    // now initialize and store Big
    initializeBigWithSophisticatedValues(largeNum, b);
    Big = b;
    // ensure that the space is freed on exit or quick_exit
    atexit(destroyBig);
    at_quick_exit(destroyBig);
}

// the user thread function that relies on Big
int myThreadFunc(void* a) {
    call_once(&onceBig, initBig);
    // only use Big from here on
    return 0;
}

The once_flag is used to coordinate different threads that might want to initialize the same data Big. The call to
call_once guarantees that

• initBig is called exactly once
• call_once blocks until such a call to initBig has been made, either by the same or another thread.


Besides allocation, a typical thing to do in such a once-called function is the dynamic initialization of thread control
data structures such as mtx_t or cnd_t, which can't be initialized statically, using mtx_init or cnd_init, respectively.


Section 57.2: Start several threads 


#include <stdio.h>
#include <threads.h>
#include <stdlib.h>

struct my_thread_data {
    double factor;
};

int my_thread_func(void* a) {
    struct my_thread_data* d = a;
    // do something with d
    printf("we found %g\n", d->factor);
    // return a success or error code
    return d->factor > 1.0;
}

int main(int argc, char* argv[argc+1]) {
    unsigned n = 4;
    if (argc > 1) n = strtoull(argv[1], 0, 0);
    // reserve space for the arguments for the threads
    struct my_thread_data D[n]; // can't be initialized
    for (unsigned i = 0; i < n; ++i) {
        D[i] = (struct my_thread_data){ .factor = 0.5*i, };
    }
    // reserve space for the IDs of the threads (one per thread)
    thrd_t id[n];
    // launch the threads
    for (unsigned i = 0; i < n; ++i) {
        thrd_create(&id[i], my_thread_func, &D[i]);
    }
    // Wait until all threads have finished, but throw away their
    // return values
    for (unsigned i = 0; i < n; ++i) {
        thrd_join(id[i], 0);
    }
    return EXIT_SUCCESS;
}


Chapter 58: Multithreading


C11 defines a standard thread library, <threads.h>, but it is an optional feature: an implementation that defines
__STDC_NO_THREADS__ does not provide it. Where it is unavailable, you must use a platform-specific implementation
such as the POSIX threads library (often referred to as pthreads), declared in the pthread.h header.


Section 58.1: C11 Threads simple example 


#include <threads.h>
#include <stdio.h>

int run(void *arg)
{
    (void)arg; /* unused */
    printf("Hello world of C11 threads.\n");
    return 0;
}

int main(int argc, const char *argv[])
{
    thrd_t thread;
    int result;
    thrd_create(&thread, run, NULL);
    thrd_join(thread, &result); /* thrd_join takes the thrd_t by value */
    printf("Thread return %d at the end\n", result);
}


Chapter 59: Interprocess Communication
(IPC) 


Inter-process communication (IPC) mechanisms allow different independent processes to communicate with each 
other. Standard C does not provide any IPC mechanisms. Therefore, all such mechanisms are defined by the host 
operating system. POSIX defines an extensive set of IPC mechanisms; Windows defines another set; and other 
systems define their own variants. 


Section 59.1: Semaphores 


Semaphores are used to synchronize operations between two or more processes. POSIX defines two different sets 
of semaphore functions: 


1. 'System V IPC': semctl(), semop(), semget().
2. 'POSIX Semaphores': sem_close(), sem_destroy(), sem_getvalue(), sem_init(), sem_open(), sem_post(),
sem_trywait(), sem_unlink().


This section describes the System V IPC semaphores, so called because they originated with Unix System V. 


First, you'll need to include the required headers. Old versions of POSIX required #include <sys/types.h>; modern 
POSIX and most systems do not require it. 


#include <sys/sem.h> 
Then, you'll need to define a key in both the parent as well as the child. 


#define KEY 0x1111


This key needs to be the same in both programs or they will not refer to the same IPC structure. There are ways to 
generate an agreed key without hard-coding its value. 
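
One common way to agree on a key without hard-coding it is the POSIX ftok() function, which derives a key from an existing file path and a small project id. The sketch below is only an illustration: the helper name make_key and the idea of passing the path and id as parameters are assumptions, not part of the original example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>

/* Hypothetical helper: derive a System V IPC key from a path that exists
 * on the system and a project id that both programs agree on.
 * Returns (key_t)-1 on failure, as ftok() itself does. */
static key_t make_key(const char *path, int project_id)
{
    key_t key = ftok(path, project_id);
    if (key == (key_t)-1) {
        perror("ftok");
    }
    return key;
}
```

Both programs would then call something like make_key("/some/shared/file", 'A') (path and id chosen here purely for illustration) and pass the result to semget() in place of KEY.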


Next, depending on your compiler, you may or may not need to do this step: declare a union for the purpose of 
semaphore operations. 


union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};


Next, define your try (semwait) and raise (semsignal) structures. The names P and V originate from Dutch.


struct sembuf p = { 0, -1, SEM_UNDO}; /* semwait */
struct sembuf v = { 0, +1, SEM_UNDO}; /* semsignal */

Now, start by getting the id for your IPC semaphore. 


int id;
// 2nd argument is number of semaphores
// 3rd argument is the mode (IPC_CREAT creates the semaphore set if needed)
if ((id = semget(KEY, 1, 0666 | IPC_CREAT)) < 0) {
    /* error handling code */
}

In the parent, initialise the semaphore to have a counter of 1.


union semun u;
u.val = 1;
/* SETVAL is a macro to specify that you're setting the value of
   the semaphore to that specified by the union u */
if (semctl(id, 0, SETVAL, u) < 0) {
    /* error handling code */
}


Now, you can decrement or increment the semaphore as you need. At the start of your critical section, you 
decrement the counter using the semop() function: 


if (semop(id, &p, 1) < 0) {
    /* error handling code */
}

To increment the semaphore, you use &v instead of &p:


if (semop(id, &v, 1) < 0) {
    /* error handling code */
}


Note that semctl() (with SETVAL) and semop() return 0 on success, semget() returns a non-negative identifier on
success, and all three return -1 on failure. Not checking these return values can cause devastating problems.


Example 1.1: Racing with Threads 


The below program will have a process fork a child and both parent and child attempt to print characters onto the 
terminal without any synchronization. 


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main()
{
    int pid;
    pid = fork();
    srand(pid);
    if(pid < 0)
    {
        perror("fork"); exit(1);
    }
    else if(pid)
    {
        char *s = "abcdefgh";
        int l = strlen(s);
        for(int i = 0; i < l; ++i)
        {
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
        }
    }
    else
    {
        char *s = "ABCDEFGH";
        int l = strlen(s);
        for(int i = 0; i < l; ++i)
        {
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
        }
    }
}


Output (1st run):

aAABaBCbCbDDcEEcddeFFGGHHeffgghh

Output (2nd run):

aabbccAABddBCeeCffgDDghEEhFFGGHH


Compiling and running this program should give you a different output each time. 
Example 1.2: Avoid Racing with Semaphores 
Modifying Example 1.1 to use semaphores, we have: 


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define KEY 0x1111

union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

struct sembuf p = { 0, -1, SEM_UNDO};
struct sembuf v = { 0, +1, SEM_UNDO};

int main()
{
    int id = semget(KEY, 1, 0666 | IPC_CREAT);
    if(id < 0)
    {
        perror("semget"); exit(11);
    }
    union semun u;
    u.val = 1;
    if(semctl(id, 0, SETVAL, u) < 0)
    {
        perror("semctl"); exit(12);
    }
    int pid;
    pid = fork();
    srand(pid);
    if(pid < 0)
    {
        perror("fork"); exit(1);
    }
    else if(pid)
    {
        char *s = "abcdefgh";
        int l = strlen(s);
        for(int i = 0; i < l; ++i)
        {
            /* enter the critical section */
            if(semop(id, &p, 1) < 0)
            {
                perror("semop p"); exit(13);
            }
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
            putchar(s[i]);
            fflush(stdout);
            /* leave the critical section */
            if(semop(id, &v, 1) < 0)
            {
                perror("semop v"); exit(14);
            }
            sleep(rand() % 2);
        }
    }
    else
    {
        char *s = "ABCDEFGH";
        int l = strlen(s);
        for(int i = 0; i < l; ++i)
        {
            if(semop(id, &p, 1) < 0)
            {
                perror("semop p"); exit(15);
            }
            putchar(s[i]);
            fflush(stdout);
            sleep(rand() % 2);
            putchar(s[i]);
            fflush(stdout);
            if(semop(id, &v, 1) < 0)
            {
                perror("semop v"); exit(16);
            }
            sleep(rand() % 2);
        }
    }
}
Output: 


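aabbAABBCCccddeeDDffEEFFGGHHgghh

The order in which parent and child take turns can still vary between runs, but the two copies of each character are
now always printed together, because each pair is written inside the critical section protected by the semaphore.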


Chapter 60: Testing frameworks


Many developers use unit tests to check that their software works as expected. Unit tests check small units of larger 
pieces of software, and ensure that the outputs match expectations. Testing frameworks make unit testing easier 
by providing set-up/tear-down services and coordinating the tests. 


There are many unit testing frameworks available for C. For example, Unity is a pure C framework. People quite 
often use C++ testing frameworks to test C code; there are many C++ test frameworks too. 


Section 60.1: Unity Test Framework 


Unity is an xUnit-style test framework for unit testing C. It is written completely in C and is portable, quick, simple,
expressive and extensible. It is designed to be especially useful for unit testing embedded systems.


A simple test case that checks the return value of a function might look as follows:


void test_FunctionUnderTest_should_ReturnFive( void) 


{ 
TEST_ASSERT_EQUAL_INT( 5, FunctionUnderTest() ); 


} 
A full test file might look like: 


#include "unity.h" 
#include "UnitUnderTest.h" /* The unit to be tested. */ 


void setUp (void) {} /* Is run before every test, put unit init calls here. */ 
void tearDown (void) {} /* Is run after every test, put unit clean-up calls here. */ 


void test_TheFirst(void) 


{ 
TEST_IGNORE_MESSAGE("Hello world!"); /* Ignore this test but print a message. */ 
} 
int main (void) 
{ 
UNITY_BEGIN() ; 
RUN_TEST(test_TheFirst); /* Run the test. */ 
return UNITY_END(); 
} 


Unity comes with some example projects, makefiles and some Ruby rake scripts that help make creating longer test 
files a bit easier. 


Section 60.2: CMocka 


CMocka is an elegant unit testing framework for C with support for mock objects. It only requires the standard C 
library, works on a range of computing platforms (including embedded) and with different compilers. It has a 
tutorial on testing with mocks, API documentation, and a variety of examples. 


#include <stdarg.h> 
#include <stddef.h> 
#include <setjmp.h> 
#include <cmocka.h> 


void null_test_success (void ** state) {}

void null_test_fail (void ** state)
{
    assert_true (0);
}

/** These functions will be used to initialize
    and clean resources up after each test run */
int setup (void ** state)
{
    return 0;
}

int teardown (void ** state)
{
    return 0;
}

int main (void)
{
    const struct CMUnitTest tests [] =
    {
        cmocka_unit_test (null_test_success),
        cmocka_unit_test (null_test_fail),
    };

    /* If setup and teardown functions are not
       needed, then NULL may be passed instead */

    int count_fail_tests =
        cmocka_run_group_tests (tests, setup, teardown);

    return count_fail_tests;
}

Section 60.3: CppUTest


CppUTest is an xUnit-style framework for unit testing C and C++. It is written in C++ and aims for portability and
simplicity in design. It has support for memory leak detection, building mocks, and running its tests alongside
Google Test. It comes with helper scripts and sample projects for Visual Studio and Eclipse CDT.


#include <CppUTest/CommandLineTestRunner.h>
#include <CppUTest/TestHarness.h>

TEST_GROUP(Foo_Group) {};

TEST(Foo_Group, Foo_TestOne) {}

/* Test runner may be provided options, such
   as to enable colored output, to run only a
   specific test or a group of tests, etc. This
   will return the number of failed tests. */

int main(int argc, char ** argv)
{
    return RUN_ALL_TESTS(argc, argv);
}

A test group may have a setup() and a teardown() method. The setup() method is called prior to each test and the
teardown() method is called after. Both are optional and either may be omitted independently. Other methods and
variables may also be declared inside a group and will be available to all tests of that group.


TEST_GROUP(Foo_Group)
{
    size_t data_bytes = 128;
    void * data;

    void setup()
    {
        data = malloc(data_bytes);
    }

    void teardown()
    {
        free(data);
    }

    void clear()
    {
        memset(data, 0, data_bytes);
    }
};


Chapter 61: Valgrind


Section 61.1: Bytes lost -- Forgetting to free 
Here is a program that calls malloc but not free: 


#include <stdio.h> 
#include <stdlib.h> 


int main(int argc, char **argv)
{
    char *s;
    s = malloc(26); // the culprit
    return 0;
}


With no extra arguments, valgrind prints only a leak summary and does not show where the leaked block was
allocated.


But if we turn on --leak-check=yes, it will report each leak and, if the program was compiled with debugging
information, display the lines responsible for it:


$ valgrind -q --leak-check=yes ./missing_free
==4776== 26 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4776==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
==4776==    by 0x80483F8: main (missing_free.c:9)
==4776==


If the program is not compiled with debugging information (for example, without the -g flag in GCC), valgrind will
still show where the leak happened in terms of the relevant function, but not the line numbers.


This lets us go back and look at what block was allocated in that line and try to trace forward to see why it wasn't 
freed. 


Section 61.2: Most common errors encountered while using 
Valgrind 


Valgrind reports the location of each error at the end of each stack-trace line, in the format
(file.c:line_no). Errors in valgrind are summarised in the following way:


ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The most common errors include:


1. Illegal read/write errors 


==8451== Invalid read of size 2
==8451==    at 0x4E7381D: getenv (getenv.c:84)
==8451==    by 0x4EB1559: __libc_message (libc_fatal.c:80)
==8451==    by 0x4F5256B: __fortify_fail (fortify_fail.c:37)
==8451==    by 0x4F5250F: __stack_chk_fail (stack_chk_fail.c:28)
==8451==    by 0x40059C: main (valg.c:10)
==8451==  Address 0x700000007 is not stack'd, malloc'd or (recently) free'd

This happens when the code starts to access memory which does not belong to the program. The size of the
memory accessed also gives you an indication of what variable was used.


2. Use of Uninitialized Variables 


==8795== 1 errors in context 5 of 8: 
==8795== Conditional jump or move depends on uninitialised value(s) 


==8795==    at 0x4E881AF: vfprintf (vfprintf.c:1631)
==8795==    by 0x4E8F898: printf (printf.c:33)
==8795==    by 0x400548: main (valg.c:7)


According to the error, at line 7 of valg.c, in main, an uninitialized variable was passed to printf().


3. Illegal freeing of Memory 


==8954== Invalid free() / delete / delete[] / realloc()
==8954==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8954==    by 0x4005A8: main (valg.c:10)
==8954==  Address 0x5203040 is 0 bytes inside a block of size 240 free'd
==8954==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8954==    by 0x40059C: main (valg.c:9)
==8954==  Block was alloc'd at
==8954==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8954==    by 0x40058C: main (valg.c:7)


According to valgrind, the code freed the memory illegally (a second time) at line 10 of valg.c, whereas it was
already freed at line 9, and the block itself was allocated at line 7.


Section 61.3: Running Valgrind 
valgrind ./my-program arg1 arg2 < test-input 


This will run your program and produce a report of any allocations and de-allocations it did. It will also warn you 
about common errors like using uninitialized memory, dereferencing pointers to strange places, writing off the end 
of blocks allocated using malloc, or failing to free blocks. 


Section 61.4: Adding flags 
You can also turn on more tests, such as: 
valgrind -q --tool=memcheck --leak-check=yes ./my-program arg1 arg2 < test-input 


See valgrind --help for more information about the (many) options, or look at the documentation at 
http://valgrind.org/ for detailed information about what the output means. 


Chapter 62: Common C programming
idioms and developer practices 


Section 62.1: Comparing literal and variable 


Suppose you are comparing a variable with some literal value:


if (i == 2) // Bad way
{
    doSomething;
}


Now suppose you mistakenly type = instead of ==. The code still compiles, the condition silently becomes an
assignment, and the bug can take a long time to track down. Writing the literal first avoids this:


if (2 == i) // Good way
{
    doSomething;
}


Then, if an equal sign is accidentally left out, the compiler will complain about an attempted assignment to a literal.
This won't protect you when comparing two variables, but every little bit helps.


Section 62.2: Do not leave the parameter list of a function 
blank — use void 


Suppose you are creating a function that requires no arguments when it is called and you are faced with the 
dilemma of how you should define the parameter list in the function prototype and the function definition. 


• You have the choice of keeping the parameter list empty for both prototype and definition. That way, they
  look just like the function call statement you will need.

• You read somewhere that one of the uses of the keyword void (there are only a few of them) is to define the
  parameter list of functions that do not accept any arguments in their call. So, this is also a choice.


So, which is the correct choice? 
ANSWER: using the keyword void 


GENERAL ADVICE: If a language provides a feature for a specific purpose, you are better off using it in your code.
For example, use enums instead of #define macros (that's for another example).


C11 section 6.7.6.3 "Function declarators", paragraph 10, states: 


The special case of an unnamed parameter of type void as the only item in the list specifies that the 
function has no parameters. 


Paragraph 14 of that same section shows the only difference: 


... An empty list in a function declarator that is part of a definition of that function specifies that the
function has no parameters. The empty list in a function declarator that is not part of a definition of that
function specifies that no information about the number or types of the parameters is supplied.


A simplified explanation of the above, provided by K&R (pp. 72-73):


Furthermore, if a function declaration does not include arguments, as in double atof();, that too is taken
to mean that nothing is to be assumed about the arguments of atof; all parameter checking is turned off.
This special meaning of the empty argument list is intended to permit older C programs to compile with
new compilers. But it's a bad idea to use it with new programs. If the function takes arguments, declare
them; if it takes no arguments, use void.


So this is how your function prototype should look:

int foo(void);

And this is how the function definition should be:


int foo(void)
{
    <statements>
    return 1;
}

One advantage of using the above over the int foo() type of declaration (i.e. without using the keyword void) is that
the compiler can detect the error if you call your function using an erroneous statement like foo(42). This kind of
function call statement would not cause any errors if you leave the parameter list blank. The error would pass
silently, undetected, and the code would still execute.


This also means that you should define the main() function like this:


int main(void)
{
    <statements>
    return 0;
}

Note that even though a function defined with an empty parameter list takes no arguments, it does not provide a
prototype for the function, so the compiler will not complain if the function is subsequently called with arguments.
For example: 


#include <stdio.h>

static void parameterless()
{
    printf("%s called\n", __func__);
}

int main(void)
{
    parameterless(3, "arguments", "provided");
    return 0;
}


If that code is saved in the file proto79.c, it can be compiled on Unix with GCC (version 7.1.0 on macOS Sierra
10.12.5 used for demonstration) like this:


$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -pedantic proto79.c -o proto79
$ 


If you compile with more stringent options, you get errors: 


$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes \
      -Wold-style-definition -pedantic proto79.c -o proto79
proto79.c:3:13: error: function declaration isn’t a prototype [-Werror=strict-prototypes]
 static void parameterless()
proto79.c: In function ‘parameterless’:
proto79.c:3:13: error: old-style function definition [-Werror=old-style-definition]
cc1: all warnings being treated as errors
$

If you give the function the formal prototype static void parameterless(void), then the compilation gives errors: 


$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes \
      -Wold-style-definition -pedantic proto79.c -o proto79
proto79.c: In function ‘main’:
proto79.c:10:5: error: too many arguments to function ‘parameterless’
     parameterless(3, "arguments", "provided");
     ^~~~~~~~~~~~~
proto79.c:3:13: note: declared here
 static void parameterless(void)
             ^~~~~~~~~~~~~


Moral — always make sure you have prototypes, and make sure your compiler tells you when you are not obeying 
the rules. 


Chapter 63: Common pitfalls


This section discusses some of the common mistakes that a C programmer should be aware of and should avoid 
making. For more on some unexpected problems and their causes, please see Undefined behavior 


Section 63.1: Mixing signed and unsigned integers in arithmetic 
operations 


It is usually not a good idea to mix signed and unsigned integers in arithmetic operations. For example, what will
the following program print?


#include <stdio.h> 


int main(void)
{
    unsigned int a = 1000;
    signed int b = -1;
    if (a > b) puts("a is more than b");
    else puts("a is less or equal than b");
    return 0;
}


Since 1000 is more than -1 you would expect the output to be a is more than b, however that will not be the case.


Arithmetic operations between different integral types are performed within a common type defined by the so 
called usual arithmetic conversions (see the language specification, 6.3.1.8). 


In this case the "common type" is unsigned int, because, as stated in the usual arithmetic conversions,


Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the
type of the other operand, then the operand with signed integer type is converted to the type of the
operand with unsigned integer type.


This means that int operand b will get converted to unsigned int before the comparison. 


When -1 is converted to an unsigned int the result is the maximal possible unsigned int value, which is greater
than 1000, meaning that a > b is false.
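
One way to compare an int with an unsigned int according to their mathematical values is to handle the negative case explicitly before any conversion takes place. The helper below is a minimal sketch of that idea; the name int_less_than_unsigned is made up for this illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: true if the signed value s is mathematically
 * smaller than the unsigned value u. */
static bool int_less_than_unsigned(int s, unsigned int u)
{
    /* A negative int is smaller than every unsigned value. Only a
     * non-negative int can be converted to unsigned int without
     * changing its value. */
    return s < 0 || (unsigned int)s < u;
}
```

With this helper, int_less_than_unsigned(-1, 1000u) is true, whereas the plain expression -1 < 1000u is false for the reason explained above.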


Section 63.2: Macros are simple string replacements 


Macros are simple string replacements. (Strictly speaking, they work with preprocessing tokens, not arbitrary 
strings.) 


#include <stdio.h>
#define SQUARE(x) x*x

int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}

You may expect this code to print 9 (3*3), but actually 5 will be printed because the macro will be expanded to
1+2*1+2.


You should wrap the arguments and the whole macro expression in parentheses to avoid this problem. 


#include <stdio.h>
#define SQUARE(x) ((x)*(x))

int main(void) {
    printf("%d\n", SQUARE(1+2));
    return 0;
}

Another problem is that the arguments of a macro are not guaranteed to be evaluated once; they may not be 
evaluated at all, or may be evaluated multiple times. 


#include <stdio.h>

#define MIN(x, y) ((x) <= (y) ? (x) : (y))

int main(void) {
    int a = 0;
    printf("%d\n", MIN(a++, 10));
    printf("a = %d\n", a);
    return 0;
}


In this code, the macro will be expanded to ((a++) <= (10) ? (a++) : (10)). Since a++ (0) is smaller than 10, a++
will be evaluated twice, which makes both the value of a and the value returned from MIN differ from what you may
expect.


This can be avoided by using functions, but note that the types will be fixed by the function definition, whereas 
macros can be (too) flexible with types. 


#include <stdio.h>

int min(int x, int y) {
    return x <= y ? x : y;
}

int main(void) {
    int a = 0;
    printf("%d\n", min(a++, 10));
    printf("a = %d\n", a);
    return 0;
}

Now the problem of double-evaluation is fixed, but this min function cannot deal with double data without 
truncating, for example. 
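
The truncation can be made visible with a small sketch. The function names min_int, min_int_of_doubles and min_double are chosen only for this illustration: min_int repeats the int version from above, and min_int_of_doubles passes double arguments to it, so they are implicitly converted to int at the call.

```c
/* The int version from the text. */
static int min_int(int x, int y)
{
    return x <= y ? x : y;
}

/* Hypothetical wrapper: a and b are implicitly converted to int when
 * passed to min_int, so their fractional parts are discarded. */
static int min_int_of_doubles(double a, double b)
{
    return min_int(a, b);
}

/* Hypothetical double counterpart that keeps the fractional part. */
static double min_double(double x, double y)
{
    return x <= y ? x : y;
}
```

Calling min_int_of_doubles(2.5, 1.7) yields 1, while min_double(2.5, 1.7) yields 1.7, which is why a separate function (or a type-generic wrapper) is needed for each type.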


Macro directives can be of two types: 


#define OBJECT_LIKE_MACRO followed by a "replacement list" of preprocessor tokens 
#define FUNCTION_LIKE_MACRO(with, arguments) followed by a replacement list 


What distinguishes these two types of macros is the character that follows the identifier after #define: if it's a
left parenthesis, it is a function-like macro; otherwise, it's an object-like macro. If the intention is to write a
function-like macro, there must not be any white space between the end of the name of the macro and (.
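
The effect of that white space can be observed by turning the result of a macro expansion into a string. The following is a sketch for illustration only; the macro names GOOD, BAD, STR and STR_ and the two helper functions are invented here.

```c
/* Two-level stringification: STR expands its argument first,
 * then STR_ turns the resulting tokens into a string literal. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Function-like macro: '(' follows the name immediately. */
#define GOOD(x) ((x) * 2)

/* Object-like macro: because of the space after BAD, everything from
 * "(x)" onwards is the replacement list, and x is not a parameter. */
#define BAD (x) ((x) * 2)

/* The argument 3 has replaced the parameter x. */
static const char *good_expansion(void) { return STR(GOOD(3)); }

/* The text "(x)" is still present: nothing was substituted. */
static const char *bad_expansion(void) { return STR(BAD(3)); }
```

Printing the two strings shows that GOOD(3) expands with 3 in place of x, while BAD(3) expands to the replacement list followed by a stray (3), which would normally not compile if used as an expression.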


Version ≥ C99


In C99 or later, you could use static inline int min(int x, int y) { ... }.


Version ≥ C11

In C11, you could write a 'type-generic' expression for min.


#include <stdio.h>

#define min(x, y) _Generic((x),                        \
                      long double: min_ld,             \
                      unsigned long long: min_ull,     \
                      default: min_i                   \
                      )(x, y)

#define gen_min(suffix, type) \
    static inline type min_##suffix(type x, type y) { return (x < y) ? x : y; }

gen_min(ld, long double)
gen_min(ull, unsigned long long)
gen_min(i, int)

int main(void)
{
    unsigned long long ull1 = 50ULL;
    unsigned long long ull2 = 37ULL;
    printf("min(%llu, %llu) = %llu\n", ull1, ull2, min(ull1, ull2));
    long double ld1 = 3.141592653L;
    long double ld2 = 3.141592652L;
    printf("min(%.10Lf, %.10Lf) = %.10Lf\n", ld1, ld2, min(ld1, ld2));
    int i1 = 3141653;
    int i2 = 3141652;
    printf("min(%d, %d) = %d\n", i1, i2, min(i1, i2));
    return 0;
}


The generic expression could be extended with more types such as double, float, long long, unsigned long, long,
unsigned, with appropriate gen_min macro invocations written.


Section 63.3: Forgetting to copy the return value of realloc
into a temporary


If realloc fails, it returns NULL and leaves the original block allocated. If you assign realloc's return value
directly to the pointer that holds the original buffer, and realloc returns NULL, then the original buffer (the old
pointer) is lost, resulting in a memory leak. The solution is to copy the return value into a temporary pointer,
and only if that temporary is not NULL, copy it into the real buffer pointer.


char *buf, *tmp;

buf = malloc(...);

/* WRONG */
if ((buf = realloc(buf, 16)) == NULL)
    perror("realloc");

/* RIGHT */
if ((tmp = realloc(buf, 16)) != NULL)
    buf = tmp;
else
    perror("realloc");
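
The temporary-pointer pattern can be wrapped in a small helper so that it is not forgotten at each call site. This is a minimal sketch: the name resize_buffer and its return convention are assumptions made for this example, and it assumes new_size is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *buf to new_size bytes (new_size > 0).
 * Returns 0 on success. On failure it returns -1 and leaves *buf
 * untouched, so the caller still owns the old block and can free it. */
static int resize_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL) {
        return -1;      /* old block is still valid */
    }
    *buf = tmp;         /* only overwrite the pointer on success */
    return 0;
}
```

A caller would write something like if (resize_buffer(&buf, 16) != 0) { /* handle error, buf is unchanged */ }.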


Section 63.4: Forgetting to allocate one extra byte for \0
When you are copying a string into a malloced buffer, always remember to add 1 to strlen. 


char *dest = malloc(strlen(src)); /* WRONG */
char *dest = malloc(strlen(src) + 1); /* RIGHT */


strcpy(dest, src);


This is because strlen does not include the trailing \0 in the length. If you take the WRONG (as shown above)
approach, upon calling strcpy, your program would invoke undefined behaviour.


It also applies to situations when you are reading a string of known maximum length from stdin or some other 
source. For example 


#define MAX_INPUT_LEN 42 


char buffer[MAX_INPUT_LEN]; /* WRONG */ 
char buffer[MAX_INPUT_LEN + 1]; /* RIGHT */ 


scanf("%42s", buffer); /* Ensure that the buffer is not overflowed */ 
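
The strlen(src) + 1 rule for copying into a malloced buffer can be captured once in a helper. The function below is a sketch for illustration; the name dup_string is hypothetical (POSIX systems offer strdup() for the same purpose).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of src,
 * or NULL if the allocation fails. The caller must free() the result. */
static char *dup_string(const char *src)
{
    size_t len = strlen(src);       /* does not count the trailing \0 */
    char *dest = malloc(len + 1);   /* +1 makes room for the \0 */
    if (dest != NULL) {
        memcpy(dest, src, len + 1); /* copies the characters and the \0 */
    }
    return dest;
}
```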


Section 63.5: Misunderstanding array decay 


A common problem in code that uses multidimensional arrays, arrays of pointers, etc. is the fact that Type** and
Type[M][N] are fundamentally different types:


#include <stdio.h> 


void print_strings(char **strings, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        puts(strings[i]);
}

int main(void)
{
    char s[4][20] = {"Example 1", "Example 2", "Example 3", "Example 4"};
    print_strings(s, 4);
    return 0;
}


Sample compiler output: 


file1.c: In function ‘main’:
file1.c:13:23: error: passing argument 1 of ‘print_strings’ from incompatible pointer type [-Wincompatible-pointer-types]
         print_strings(strings, 4);
                       ^
file1.c:3:10: note: expected ‘char **’ but argument is of type ‘char (*)[20]’
     void print_strings(char **strings, size_t n)


The error states that the s array in the main function is passed to the function print_strings, which expects a 
different pointer type than it received. It also includes a note expressing the type that is expected by print_strings 
and the type that was passed to it from main. 


The problem is due to something called array decay. When s, with its type char [4][20] (array of 4 arrays of 20
chars), is passed to the function, it turns into a pointer to its first element, as if you had written &s[0],
which has the type char (*)[20] (pointer to 1 array of 20 chars). This occurs for any array, including an array of
pointers, an array of arrays of arrays (3-D arrays), and an array of pointers to an array. Below is a table illustrating
what happens when an array decays:


Before Decay                                                    After Decay
char [20]        array of (20 chars)                            char *           pointer to (1 char)
char [4][20]     array of (4 arrays of 20 chars)                char (*)[20]     pointer to (1 array of 20 chars)
char *[4]        array of (4 pointers to 1 char)                char **          pointer to (1 pointer to 1 char)
char [3][4][20]  array of (3 arrays of 4 arrays of 20 chars)    char (*)[4][20]  pointer to (1 array of 4 arrays of 20 chars)
char (*[4])[20]  array of (4 pointers to 1 array of 20 chars)   char (**)[20]    pointer to (1 pointer to 1 array of 20 chars)


If an array can decay to a pointer, then it can be said that a pointer may be considered an array of at least 1 
element. An exception to this is a null pointer, which points to nothing and is consequently not an array. 


Array decay only happens once. If an array has decayed to a pointer, it is now a pointer, not an array. Even if you 
have a pointer to an array, remember that the pointer might be considered an array of at least one element, so 
array decay has already occurred. 


In other words, a pointer to an array (char (*)[20]) will never become a pointer to a pointer (char **). To fix the
print_strings function, simply make it receive the correct type:


void print_strings(char (*strings)[20], size_t n)
/* OR */
void print_strings(char strings[][20], size_t n)
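
The difference between char (*)[20] and char ** shows up in pointer arithmetic: stepping a pointer to an array moves by a whole row, while stepping a pointer to a pointer moves by the size of one pointer. The two helper functions below are invented for this illustration and simply measure that step in bytes.

```c
#include <stddef.h>

/* Hypothetical helper: bytes between rows[0] and rows[1] when the
 * argument is a pointer to arrays of 20 chars. */
static size_t row_stride(char (*rows)[20])
{
    return (size_t)((char *)(rows + 1) - (char *)rows);
}

/* Hypothetical helper: bytes between ptrs[0] and ptrs[1] when the
 * argument is a pointer to pointers to char. */
static size_t pointer_stride(char **ptrs)
{
    return (size_t)((char *)(ptrs + 1) - (char *)ptrs);
}
```

For char s[4][20], row_stride(s) is 20, whereas for char *p[4], pointer_stride(p) is sizeof(char *). Indexing s through a char ** would therefore read the wrong memory.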


A problem arises when you want the print_strings function to be generic for any array of chars: what if there are 
30 chars instead of 20? Or 50? The answer is to add another parameter before the array parameter: 


#include <stdio.h>

/*
 * Note the rearranged parameters and the change in the parameter name
 * from the previous definitions:
 *      n (number of strings)
 *   => scount (string count)
 *
 * Of course, you could also use one of the following highly recommended forms
 * for the `strings` parameter instead:
 *
 *    char strings[scount][ccount]
 *    char strings[][ccount]
 */
void print_strings(size_t scount, size_t ccount, char (*strings)[ccount])
{
    size_t i;
    for (i = 0; i < scount; i++)
        puts(strings[i]);
}

int main(void)
{
    char s[4][20] = {"Example 1", "Example 2", "Example 3", "Example 4"};
    print_strings(4, 20, s);
    return 0;
}


Compiling it produces no errors and results in the expected output: 


Example 1 
Example 2 
Example 3 
Example 4 


Section 63.6: Forgetting to free memory (memory leaks) 


A programming best practice is to free any memory that has been allocated directly by your own code, or implicitly 
by calling an internal or external function, such as a library API like strdup(). Failing to free memory can introduce 
a memory leak, which could accumulate into a substantial amount of wasted memory that is unavailable to your 
program (or the system), possibly leading to crashes or undefined behavior. Problems are more likely to occur if the 
leak is incurred repeatedly in a loop or recursive function. The risk of program failure increases the longer a leaking 
program runs. Sometimes problems appear instantly; other times problems won't be seen for hours or even years 
of constant operation. Memory exhaustion failures can be catastrophic, depending on the circumstances. 


The following infinite loop is an example of a leak that will eventually exhaust available memory by calling 
getline(), a function that implicitly allocates new memory, without freeing that memory. 


#include <stdlib.h> 
#include <stdio.h> 

int main(void) 
{ 
    char *line = NULL; 
    size_t size = 0; 

    /* The loop below leaks memory as fast as it can */ 
    for (;;) { 
        getline(&line, &size, stdin); /* New memory implicitly allocated */ 
        /* <do whatever> */ 
        line = NULL; 
    } 
    return 0; 
} 


In contrast, the code below also uses the getline() function, but this time, the allocated memory is correctly freed, 
avoiding a leak. 


#include <stdlib.h> 
#include <stdio.h> 

int main(void) 
{ 
    char *line = NULL; 
    size_t size = 0; 

    for (;;) { 
        if (getline(&line, &size, stdin) < 0) { 
            free(line); 
            line = NULL; 
            /* Handle failure such as setting flag, breaking out of loop and/or exiting */ 
        } 
        /* <do whatever> */ 
        free(line); 
        line = NULL; 
    } 
    return 0; 
} 


Leaking memory doesn't always have tangible consequences and isn't necessarily a functional problem. While "best 
practice" dictates rigorously freeing memory at strategic points and conditions, to reduce memory footprint and 
lower risk of memory exhaustion, there can be exceptions. For example, if a program is bounded in duration and 
scope, the risk of allocation failure might be considered too small to worry about. In that case, bypassing explicit 
deallocation might be considered acceptable. For example, most modern operating systems automatically free all 
memory consumed by a program when it terminates, whether it is due to program failure, a system call to exit(), 
process termination, or reaching end of main(). Explicitly freeing memory at the point of imminent program 
termination could actually be redundant or introduce a performance penalty. 


Allocation can fail if insufficient memory is available, and handling failures should be accounted for at appropriate 
levels of the call stack. getline(), shown above, is an interesting use case because it is a library function that not 
only allocates memory that it leaves to the caller to free, but can also fail for a number of reasons, all of which 
must be taken into account. Therefore, when using a C API, it is essential to read the documentation (man page), pay 
particular attention to error conditions and memory usage, and be aware of which software layer bears the burden of 
freeing returned memory. 


Another common memory handling practice is to consistently set pointers to NULL immediately after the memory they 
reference is freed, so those pointers can be tested for validity at any time (e.g. checked for NULL / non-NULL), 
because accessing freed memory can lead to severe problems such as reading garbage data, corrupting data on a write, 
and/or crashing the program. Passing a null pointer to free() is a no-op (i.e. it is harmless), as required by the C 
standard, so once a pointer has been set to NULL there is no risk of double-freeing memory if the pointer is passed to 
free() again. Keep in mind that double-freeing memory can lead to very time-consuming, confusing, and 
difficult-to-diagnose failures. 
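

One way to make the free-then-NULL habit harder to forget is to wrap both steps in a small helper. This is only a sketch; free_and_null is a hypothetical name, and it is written for char * to match the getline() examples.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and resets it to NULL, so that calling
 * it again on the same pointer is harmless. */
void free_and_null(char **pp)
{
    if (pp != NULL) {
        free(*pp);   /* free(NULL) is a no-op per the C standard */
        *pp = NULL;
    }
}
```

A caller would write free_and_null(&line); instead of the two separate statements. Calling it twice in a row on the same variable is safe because the second call only sees NULL.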


Section 63.7: Copying too much 


char buf[8]; /* tiny buffer, easy to overflow */ 


printf("What is your name?\n"); 
scanf("%s", buf); /* WRONG */ 
scanf("%7s", buf); /* RIGHT */ 


If the user enters a string longer than 7 characters (the buffer size minus 1 for the null terminator), the unbounded 
scanf("%s", buf) call overwrites memory past the end of buf. This results in undefined behavior. Attackers often 
exploit this kind of overflow to overwrite the return address and redirect execution to their own malicious code. 
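

The same bounded-write idea applies when copying a string that is already in memory. The sketch below uses snprintf() to cap the number of bytes written; the helper name copy_bounded is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst, writing at most size bytes
 * including the null terminator. Returns 1 if src fit completely,
 * 0 if it had to be truncated (or on an encoding error). */
int copy_bounded(char *dst, size_t size, const char *src)
{
    int needed = snprintf(dst, size, "%s", src);
    return needed >= 0 && (size_t)needed < size;
}
```

With char buf[8], copying a 9-character name stores only its first 7 characters plus the terminator and reports the truncation through the return value, instead of writing past the end of the buffer.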


Section 63.8: Mistakenly writing = instead of == when 
comparing 


The = operator is used for assignment. 
The == operator is used for comparison. 


One should be careful not to mix the two. Sometimes one mistakenly writes 


/* assign y to x */ 
if (x = y) { 
    /* logic */ 
} 


when what was really wanted is: 


/* compare if x is equal to y */ 
if (x == y) { 
    /* logic */ 
} 


The former assigns the value of y to x and checks whether that value is nonzero, instead of doing a comparison. It is 
equivalent to: 


if ((x = y) != 0) { 
    /* logic */ 
} 


There are times when testing the result of an assignment is intended and is commonly used, because it avoids 
having to duplicate code and having to treat the first time specially. Compare 


while ((c = getopt_long(argc, argv, short_options, long_options, &option_index)) != -1) { 
    switch (c) { 
    /* ... */ 
    } 
} 


versus 


c = getopt_long(argc, argv, short_options, long_options, &option_index); 
while (c != -1) { 
    switch (c) { 
    /* ... */ 
    } 
    c = getopt_long(argc, argv, short_options, long_options, &option_index); 
} 


Modern compilers will recognise this pattern and do not warn when the assignment is inside an extra pair of 
parentheses like above, but may warn for other usages. For example: 


if (x = y)         /* warning */ 
if ((x = y))       /* no warning */ 
if ((x = y) != 0)  /* no warning; explicit */ 


Some programmers use the strategy of putting the constant to the left of the operator (commonly called Yoda 
conditions). Because constants are rvalues, this style of condition will cause the compiler to report an error if the 
wrong operator was used. 


if (5 = y) /* Error */ 


if (5 == y) /* No error */ 


However, this severely reduces the readability of the code and is not considered necessary if the programmer 
follows good C coding practices. It also doesn't help when comparing two variables, so it isn't a universal solution. 
Furthermore, many modern compilers may give warnings when code is written with Yoda conditions. 


Section 63.9: Newline character is not consumed in typical 
scanf() call 


When this program 


#include <stdio.h> 
#include <string.h> 

int main(void) { 
    int num = 0; 
    char str[128], *lf; 

    scanf("%d", &num); 
    fgets(str, sizeof(str), stdin); 

    if ((lf = strchr(str, '\n')) != NULL) *lf = '\0'; 
    printf("%d \"%s\"\n", num, str); 
    return 0; 
} 

is executed with this input 


42 
life 


the output will be 42 "" instead of expected 42 "life". 


This is because the newline character after 42 is not consumed by the call to scanf(); it is consumed by fgets() 
before it reads life. fgets() then stops reading before it reaches life. 


To avoid this problem, one approach that is useful when the maximum length of a line is known (when solving 
problems in an online judge system, for example) is to avoid using scanf() directly and to read all lines via fgets(). 
You can use sscanf() to parse the lines read. 


#include <stdio.h> 
#include <string.h> 

int main(void) { 
    int num = 0; 
    char line_buffer[128] = "", str[128], *lf; 

    fgets(line_buffer, sizeof(line_buffer), stdin); 
    sscanf(line_buffer, "%d", &num); 
    fgets(str, sizeof(str), stdin); 

    if ((lf = strchr(str, '\n')) != NULL) *lf = '\0'; 
    printf("%d \"%s\"\n", num, str); 
    return 0; 
} 

Another way is to read until you hit a newline character after using scanf() and before using fgets(). 


#include <stdio.h> 
#include <string.h> 

int main(void) { 
    int num = 0; 
    char str[128], *lf; 
    int c; 

    scanf("%d", &num); 
    while ((c = getchar()) != '\n' && c != EOF); 
    fgets(str, sizeof(str), stdin); 

    if ((lf = strchr(str, '\n')) != NULL) *lf = '\0'; 
    printf("%d \"%s\"\n", num, str); 
    return 0; 
} 

Section 63.10: Adding a semicolon to a #define 


It is easy to get confused in the C preprocessor, and treat it as part of C itself, but that is a mistake because the 
preprocessor is just a text substitution mechanism. For example, if you write 


/* WRONG */ 
#define MAX 100; 
int arr[MAX]; 


the code expands to 
int arr[100;]; 


which is a syntax error. The remedy is to remove the semicolon from the #define line. It is almost invariably a 
mistake to end a #define with a semicolon. 


Section 63.11: Incautious use of semicolons 


Be careful with semicolons. The following example 


if (x > a); 
    a = x; 


actually means: 


if (x > a) {} 
a = x; 


which means x will be assigned to a in any case, which might not be what you wanted originally. 


Sometimes, a missing semicolon will also cause a problem that is easy to overlook: 


if (i < 0) 
    return 
day = date[0]; 
hour = date[1]; 
minute = date[2]; 


The semicolon after return is missing, so the value of day = date[0] will be returned. 


One technique to avoid this and similar problems is to always use braces on multi-line conditionals and loops. For 
example: 
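

A minimal sketch of the braced style follows. The helper name raise_to is hypothetical and only serves to give the conditional a home.

```c
/* Hypothetical helper: raises *a to x when x is larger. The braces make
 * the extent of the conditional explicit, so a stray semicolon after the
 * condition would stand out instead of silently emptying the body. */
void raise_to(int *a, int x)
{
    if (x > *a) {
        *a = x;
    }
}
```

With the braces in place, the assignment can only run when the condition holds.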


Section 63.12: Undefined reference errors when linking 
One of the most common errors in compilation happens during the linking stage. The error looks similar to this: 


$ gcc undefined_reference.c 

/tmp/ccoXhwF0.o: In function `main': 
undefined_reference.c:(.text+0x15): undefined reference to `foo' 
collect2: error: ld returned 1 exit status 


$ 


So let's look at the code that generated this error: 
int foo(void); 

int main(int argc, char **argv) 
{ 
    int foo_val; 
    foo_val = foo(); 
    return foo_val; 
} 


We see here a declaration of foo (int foo(void);) but no definition of it (the actual function body). So we provided 
the compiler with the function prototype, but no such function was defined anywhere, so the compilation stage passes 
but the linker exits with an undefined reference error. 

To fix this error in our small program we would only have to add a definition for foo: 


/* Declaration of foo */ 
int foo(void); 

/* Definition of foo */ 
int foo(void) 
{ 
    return 5; 
} 

int main(int argc, char **argv) 
{ 
    int foo_val; 
    foo_val = foo(); 
    return foo_val; 
} 


Now this code will compile. An alternative situation arises where the source for foo() is in a separate source file 
foo.c (and there's a header foo.h to declare foo() that is included in both foo.c and undefined_reference.c). 
Then the fix is to link both the object file from foo.c and undefined_reference.c, or to compile both the source 
files: 


$ gcc -c undefined_reference.c 
$ gcc -c foo.c 
$ gcc -o working_program undefined_reference.o foo.o 
$ 


Or: 


$ gcc -o working_program undefined_reference.c foo.c 
$ 


A more complex case is where libraries are involved, like in the code: 


#include <stdio.h> 
#include <stdlib.h> 
#include <math.h> 

int main(int argc, char **argv) 
{ 
    double first; 
    double second; 
    double power; 

    if (argc != 3) 
    { 
        fprintf(stderr, "Usage: %s <denom> <nom>\n", argv[0]); 
        return EXIT_FAILURE; 
    } 

    /* Translate user input to numbers, extra error checking 
     * should be done here. */ 
    first = strtod(argv[1], NULL); 
    second = strtod(argv[2], NULL); 

    /* Use function pow() from libm - this will cause a linkage 
     * error unless this code is compiled against libm! */ 
    power = pow(first, second); 

    printf("%f to the power of %f = %f\n", first, second, power); 
    return EXIT_SUCCESS; 
} 


The code is syntactically correct, declaration for pow() exists from #include <math.h>, so we try to compile and link 
but get an error like this: 


$ gcc no_library_in_link.c -o no_library_in_link 
/tmp/ccduQQqA.o: In function `main': 
no_library_in_link.c:(.text+0x8b): undefined reference to `pow' 
collect2: error: ld returned 1 exit status 
$ 


This happens because the definition for pow() wasn't found during the linking stage. To fix this we have to specify 
we want to link against the math library called libm by specifying the -lm flag. (Note that there are platforms such 
as macOS where -lm is not needed, but when you get the undefined reference, the library is needed.) 


So we run the compilation stage again, this time specifying the library (after the source or object files): 


$ gcc no_library_in_link.c -lm -o library_in_link_cmd 
$ ./library_in_link_cmd 2 4 
2.000000 to the power of 4.000000 = 16.000000 
$ 


And it works! 


Section 63.13: Checking logical expression against 'true' 


The original C standard had no intrinsic Boolean type, so bool, true and false had no inherent meaning and were 
often defined by programmers. Typically true would be defined as 1 and false would be defined as 0. 


Version ≥ C99 


C99 adds the built-in type _Bool and the header <stdbool.h> which defines bool (expanding to _Bool), false and 
true. It also allows you to redefine bool, true and false, but notes that this is an obsolescent feature. 


More importantly, logical expressions treat anything that evaluates to zero as false and any non-zero evaluation as 
true. For example: 


/* Return 'true' if the most significant bit is set */ 
bool isUpperBitSet(uint8_t bitField) 
{ 
    if ((bitField & 0x80) == true) /* Comparison only succeeds if true is 0x80 and bitField has 
                                      that bit set */ 
    { 
        return true; 
    } 
    else 
    { 
        return false; 
    } 
} 


In the above example, the function is trying to check if the upper bit is set and return true if it is. However, by 
explicitly checking against true, the if statement will only succeed if (bitField & 0x80) evaluates to whatever 
true is defined as, which is typically 1 and very seldom 0x80. Either explicitly check against the case you expect: 


/* Return 'true' if the most significant bit is set */ 
bool isUpperBitSet(uint8_t bitField) 
{ 
    if ((bitField & 0x80) == 0x80) /* Explicitly test for the case we expect */ 
    { 
        return true; 
    } 
    else 
    { 
        return false; 
    } 
} 


Or evaluate any non-zero value as true. 


/* Return 'true' if the most significant bit is set */ 
bool isUpperBitSet(uint8_t bitField) 
{ 
    /* If upper bit is set, result is 0x80 which the if will evaluate as true */ 
    if (bitField & 0x80) 
    { 
        return true; 
    } 
    else 
    { 
        return false; 
    } 
} 
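

A more compact variant, shown here only as a sketch, returns the result of comparing against zero directly. The snake_case name is_upper_bit_set is hypothetical and is used to keep it apart from the functions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: comparing against zero yields 0 or 1 no matter which bit the
 * mask selects, so no comparison against true is needed. */
bool is_upper_bit_set(uint8_t bit_field)
{
    return (bit_field & 0x80) != 0;
}
```

Because the != operator always produces 0 or 1, the returned value compares equal to true whenever the bit is set, whatever the mask happens to be.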


Section 63.14: Doing extra scaling in pointer arithmetic 


In pointer arithmetic, the integer added to or subtracted from a pointer is interpreted not as a change of address in 
bytes but as a number of elements to move. 


#include <stdio.h> 

int main(void) { 
    int array[] = {1, 2, 3, 4, 5}; 
    int *ptr = &array[0]; 
    int *ptr2 = ptr + sizeof(int) * 2; /* wrong */ 
    printf("%d %d\n", *ptr, *ptr2); 
    return 0; 
} 


This code does extra scaling when calculating the pointer assigned to ptr2. If sizeof(int) is 4, which is typical in 
modern 32-bit environments, the expression stands for "8 elements after array[0]", which is out of range, and it 
invokes undefined behavior. 


To have ptr2 point at what is 2 elements after array[0], you should simply add 2. 


#include <stdio.h> 

int main(void) { 
    int array[] = {1, 2, 3, 4, 5}; 
    int *ptr = &array[0]; 
    int *ptr2 = ptr + 2; 
    printf("%d %d\n", *ptr, *ptr2); /* "1 3" will be printed */ 
    return 0; 
} 

Explicit pointer arithmetic using additive operators may be confusing, so using array subscripting may be better. 


#include <stdio.h> 

int main(void) { 
    int array[] = {1, 2, 3, 4, 5}; 
    int *ptr = &array[0]; 
    int *ptr2 = &ptr[2]; 
    printf("%d %d\n", *ptr, *ptr2); /* "1 3" will be printed */ 
    return 0; 
} 


E1[E2] is identical to (*((E1)+(E2))) (N1570 6.5.2.1, paragraph 2), and &(E1[E2]) is equivalent to ((E1)+(E2)) 
(N1570 6.5.3.2, footnote 102). 


Alternatively, if pointer arithmetic is preferred, casting the pointer to address a different data type can allow byte 
addressing. Be careful though: endianness can become an issue, and casting to types other than 'pointer to 
character' leads to strict aliasing problems. 


#include <stdio.h> 

int main(void) { 
    int array[3] = {1,2,3}; // 4 bytes * 3 allocated 
    unsigned char *ptr = (unsigned char *) array; // unsigned chars only take 1 byte 
    /* 
     * Now any pointer arithmetic on ptr will match 
     * bytes in memory. ptr can be treated like it 
     * was declared as: unsigned char ptr[12]; 
     */ 

    return 0; 
} 
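

To see both units side by side, the following sketch measures the same distance once in elements and once in bytes. The helper names element_distance and byte_distance are hypothetical.

```c
#include <stddef.h>

/* Distance between two pointers into the same int array, in elements. */
ptrdiff_t element_distance(const int *from, const int *to)
{
    return to - from;
}

/* The same distance measured in bytes, via pointers to unsigned char. */
ptrdiff_t byte_distance(const int *from, const int *to)
{
    return (const unsigned char *)to - (const unsigned char *)from;
}
```

For array and array + 2, the first function reports 2 and the second reports 2 * sizeof(int). This is the scaling that the compiler already applies when an integer is added to an int pointer.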


Section 63.15: Multi-line comments cannot be nested 
In C, multi-line comments, /* and */, do not nest. 


If you annotate a block of code or function using this style of comment: 


/* 
 * max(): Finds the largest integer in an array and returns it. 
 * If the array length is less than 1, the result is undefined. 
 * arr: The array of integers to search. 
 * num: The number of integers in arr. 
 */ 
int max(int arr[], int num) 
{ 
    int max = arr[0]; 
    for (int i = 0; i < num; i++) 
        if (arr[i] > max) 
            max = arr[i]; 
    return max; 
} 

You will not be able to comment it out easily: 


//Trying to comment out the block... 
/* 

/* 
 * max(): Finds the largest integer in an array and returns it. 
 * If the array length is less than 1, the result is undefined. 
 * arr: The array of integers to search. 
 * num: The number of integers in arr. 
 */ 
int max(int arr[], int num) 
{ 
    int max = arr[0]; 
    for (int i = 0; i < num; i++) 
        if (arr[i] > max) 
            max = arr[i]; 
    return max; 
} 

//Causes an error on the line below... 
*/ 


One solution is to use C99 style comments: 


// max(): Finds the largest integer in an array and returns it. 
// If the array length is less than 1, the result is undefined. 
// arr: The array of integers to search. 
// num: The number of integers in arr. 
int max(int arr[], int num) 
{ 
    int max = arr[0]; 
    for (int i = 0; i < num; i++) 
        if (arr[i] > max) 
            max = arr[i]; 
    return max; 
} 


Now the entire block can be commented out easily: 


/* 

// max(): Finds the largest integer in an array and returns it. 
// If the array length is less than 1, the result is undefined. 
// arr: The array of integers to search. 
// num: The number of integers in arr. 
int max(int arr[], int num) 
{ 
    int max = arr[0]; 
    for (int i = 0; i < num; i++) 
        if (arr[i] > max) 
            max = arr[i]; 
    return max; 
} 
*/ 


Another solution is to avoid disabling code using comment syntax, using #ifdef or #ifndef preprocessor directives 
instead. These directives do nest, leaving you free to comment your code in the style you prefer. 


#define DISABLE_MAX /* Remove or comment this line to enable max() code block */ 

#ifndef DISABLE_MAX 
/* 
 * max(): Finds the largest integer in an array and returns it. 
 * If the array length is less than 1, the result is undefined. 
 * arr: The array of integers to search. 
 * num: The number of integers in arr. 
 */ 
int max(int arr[], int num) 
{ 
    int max = arr[0]; 
    for (int i = 0; i < num; i++) 
        if (arr[i] > max) 
            max = arr[i]; 
    return max; 
} 
#endif 


Some guides go so far as to recommend that code sections must never be commented and that if code is to be 
temporarily disabled one could resort to using an #if 0 directive. 


See #if 0 to block out code sections. 


Section 63.16: Ignoring return values of library functions 


Almost every function in C standard library returns something on success, and something else on error. For 
example, malloc will return a pointer to the memory block allocated by the function on success, and, if the function 
failed to allocate the requested block of memory, a null pointer. So you should always check the return value for 
easier debugging. 


This is bad: 


char* x = malloc(100000000000UL * sizeof *x); 
/* more code */ 
scanf("%s", x); /* This might invoke undefined behaviour and if lucky causes a segmentation 
                   violation, unless your system has a lot of memory */ 


This is good: 


#include <stdlib.h> 
#include <stdio.h> 

int main(void) 
{ 
    char* x = malloc(100000000000UL * sizeof *x); 
    if (x == NULL) { 
        perror("malloc() failed"); 
        exit(EXIT_FAILURE); 
    } 

    if (scanf("%s", x) != 1) { 
        fprintf(stderr, "could not read string\n"); 
        free(x); 
        exit(EXIT_FAILURE); 
    } 

    /* Do stuff with x. */ 

    /* Clean up. */ 
    free(x); 

    return EXIT_SUCCESS; 
} 


This way you know the cause of the error right away; otherwise you might spend hours looking for a bug in a 
completely wrong place. 
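

The same discipline applies to conversion functions such as strtol(), which report problems through several channels at once. The following is a sketch only; the wrapper name parse_long is hypothetical, and it treats trailing characters as an error, which is a design choice rather than a requirement of strtol().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts s to long, checking each way strtol can
 * signal trouble. Returns 0 on success, -1 on any failure; *out is only
 * written on success. */
int parse_long(const char *s, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);
    if (end == s)          /* no digits were consumed */
        return -1;
    if (errno == ERANGE)   /* value does not fit in a long */
        return -1;
    if (*end != '\0')      /* trailing characters after the number */
        return -1;
    *out = value;
    return 0;
}
```

A caller checks the return value before touching the result, for example if (parse_long(text, &n) != 0) { /* report and bail out */ }.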


Section 63.17: Comparing floating point numbers 


Floating point types (float, double and long double) cannot precisely represent some numbers because they have 
finite precision and represent the values in a binary format. Just like we have repeating decimals in base 10 for 
fractions such as 1/3, there are fractions that cannot be represented finitely in binary too (such as 1/3, but also, 
more importantly, 1/10). Do not directly compare floating point values; use a delta instead. 


#include <float.h> // for DBL_EPSILON and FLT_EPSILON 
#include <math.h>  // for fabs() 
#include <stdio.h> // for printf() 

int main(void) 
{ 
    double a = 0.1; // imprecise: (binary) 0.000110... 

    // may be false or true 
    if (a + a + a + a + a + a + a + a + a + a == 1.0) { 
        printf("10 * 0.1 is indeed 1.0. This is not guaranteed in the general case.\n"); 
    } 

    // Using a small delta value. 
    if (fabs(a + a + a + a + a + a + a + a + a + a - 1.0) < 0.000001) { 
        // C99 5.2.4.2.2p8 guarantees at least 10 decimal digits 
        // of precision for the double type. 
        printf("10 * 0.1 is almost 1.0.\n"); 
    } 

    return 0; 
} 


Another example: 


gcc -O3 -g -I./inc -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition rd11.c -o rd11 -L./lib -lsoq 


#include <stdio.h> 
#include <math.h> 

static inline double rel_diff(double a, double b) 
{ 
    return fabs(a - b) / fmax(fabs(a), fabs(b)); 
} 

int main(void) 
{ 
    double d1 = 3.14159265358979; 
    double d2 = 355.0 / 113.0; 

    double epsilon = 1.0; 
    for (int i = 0; i < 10; i++) 
    { 
        if (rel_diff(d1, d2) < epsilon) 
            printf("%d:%.10f <=> %.10f within tolerance %.10f (rel diff %.4E)\n", 
                   i, d1, d2, epsilon, rel_diff(d1, d2)); 
        else 
            printf("%d:%.10f <=> %.10f out of tolerance %.10f (rel diff %.4E)\n", 
                   i, d1, d2, epsilon, rel_diff(d1, d2)); 
        epsilon /= 10.0; 
    } 
    return 0; 
} 


Output: 


0:3.1415926536 <=> 3.1415929204 within tolerance 1.0000000000 (rel diff 8.4914E-08) 
1:3.1415926536 <=> 3.1415929204 within tolerance 0.1000000000 (rel diff 8.4914E-08) 
2:3.1415926536 <=> 3.1415929204 within tolerance 0.0100000000 (rel diff 8.4914E-08) 
3:3.1415926536 <=> 3.1415929204 within tolerance 0.0010000000 (rel diff 8.4914E-08) 
4:3.1415926536 <=> 3.1415929204 within tolerance 0.0001000000 (rel diff 8.4914E-08) 
5:3.1415926536 <=> 3.1415929204 within tolerance 0.0000100000 (rel diff 8.4914E-08) 
6:3.1415926536 <=> 3.1415929204 within tolerance 0.0000010000 (rel diff 8.4914E-08) 
7:3.1415926536 <=> 3.1415929204 within tolerance 0.0000001000 (rel diff 8.4914E-08) 
8:3.1415926536 <=> 3.1415929204 out of tolerance 0.0000000100 (rel diff 8.4914E-08) 
9:3.1415926536 <=> 3.1415929204 out of tolerance 0.0000000010 (rel diff 8.4914E-08) 


Section 63.18: Floating point literals are of type double by 
default 


Care must be taken when initializing variables of type float to literal values or comparing them with literal values, 
because regular floating point literals like 0.1 are of type double. This may lead to surprises: 


#include <stdio.h> 
int main() { 
    float n; 
    n = 0.1; 
    if (n > 0.1) printf("Weird\n"); 
    return 0; 
} 
// Prints "Weird" when n is float 


Here, n gets initialized and rounded to single precision, resulting in value 0.10000000149011612. Then, n is 
converted back to double precision to be compared with the 0.1 literal (which equals 0.10000000000000001), 
resulting in a mismatch. 


Besides rounding errors, mixing float variables with double literals will result in poor performance on platforms 
which don't have hardware support for double precision. 
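

The usual remedy is to give the literal an f suffix so that both sides of the comparison are float. The sketch below assumes IEEE 754 float and double and that comparisons are evaluated in the operands' own types; the function names are hypothetical.

```c
/* Compares a float against the double literal 0.1: n is widened to
 * double, where its single-precision rounding error becomes visible. */
int exceeds_double_literal(void)
{
    float n = 0.1f;
    return n > 0.1;
}

/* Compares the same float against the float literal 0.1f: both sides
 * carry the same single-precision rounding. */
int exceeds_float_literal(void)
{
    float n = 0.1f;
    return n > 0.1f;
}
```

Under those assumptions the first function returns 1 and the second returns 0.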


Section 63.19: Using character constants instead of string 
literals, and vice versa 


In C, character constants and string literals are different things. 


A character surrounded by single quotes like 'a' is a character constant. A character constant is an integer whose 
value is the character code that stands for the character. How to interpret character constants with multiple 
characters like 'abc' is implementation-defined. 


Zero or more characters surrounded by double quotes like "abc" is a string literal. A string literal is an unmodifiable 
array whose elements are type char. The string in the double quotes plus terminating null-character are the 
contents, so "abc" has 4 elements ({'a', 'b', 'c', '\0'}) 


In this example, a character constant is used where a string literal should be used. This character constant will be 
converted to a pointer in an implementation-defined manner and there is little chance for the converted pointer to 
be valid, so this example will invoke undefined behavior. 


#include <stdio.h> 

int main(void) { 
    const char *hello = 'hello, world'; /* bad */ 
    puts(hello); 
    return 0; 
} 


In this example, a string literal is used where a character constant should be used. The pointer converted from the 
string literal will be converted to an integer in an implementation-defined manner, and it will be converted to char 
in an implementation-defined manner. (How to convert an integer to a signed type which cannot represent the 
value to convert is implementation-defined, and whether char is signed is also implementation-defined.) The 
output will be meaningless. 


#include <stdio.h> 

int main(void) { 
    char c = "a"; /* bad */ 
    printf("%c\n", c); 
    return 0; 
} 


In almost all cases, the compiler will complain about these mix-ups. If it doesn't, you need to use more compiler 
warning options, or it is recommended that you use a better compiler. 
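

The difference between the two kinds of token also shows up in sizeof. The following sketch assumes the code is compiled as C, not C++; the function names are hypothetical.

```c
#include <stddef.h>

/* Size of the array behind the string literal "abc": three characters
 * plus the terminating null character. */
size_t string_literal_size(void)
{
    return sizeof "abc";
}

/* Size of the character constant 'a'. In C a character constant has
 * type int, so this is sizeof(int) rather than 1. */
size_t char_constant_size(void)
{
    return sizeof 'a';
}
```

The first function returns 4, matching the four elements listed above, while the second returns sizeof(int).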


Section 63.20: Recursive function — missing out the base 
condition 


Calculating the factorial of a number is a classic example of a recursive function. 
Missing the Base Condition: 


#include <stdio.h> 

int factorial(int n) 
{ 
    return n * factorial(n - 1); 
} 

int main() 
{ 
    printf("Factorial %d = %d\n", 3, factorial(3)); 
    return 0; 
} 


Typical output: Segmentation fault: 11 


The problem with this function is that it recurses without end, typically causing a segmentation fault once the stack 
is exhausted. It needs a base condition to stop the recursion. 


Base Condition Declared: 


#include <stdio.h> 

int factorial(int n) 
{ 
    if (n == 1) // Base Condition, very crucial in designing the recursive functions. 
    { 
        return 1; 
    } 
    else 
    { 
        return n * factorial(n - 1); 
    } 
} 

int main() 
{ 
    printf("Factorial %d = %d\n", 3, factorial(3)); 
    return 0; 
} 


Sample output 


Factorial 3 = 6 


This function will terminate as soon as it hits the condition n is equal to 1 (provided the initial value of n is at 
least 1 and small enough: the upper bound is 12 when int is a 32-bit quantity). 


Rules to be followed: 


1. Initialize the algorithm. Recursive programs often need a seed value to start with. This is accomplished either 
   by using a parameter passed to the function or by providing a gateway function that is non-recursive but that 
   sets up the seed values for the recursive calculation. 
2. Check to see whether the current value(s) being processed match the base case. If so, process and return the 
   value. 
3. Redefine the answer in terms of a smaller or simpler sub-problem or sub-problems. 
4. Run the algorithm on the sub-problem. 
5. Combine the results in the formulation of the answer. 
6. Return the results. 


Source: Recursive Function 
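

The gateway idea from rule 1 can be sketched as follows. The names factorial_worker and factorial_checked are hypothetical, and the limit of 12 assumes that int is at least 32 bits wide.

```c
/* Recursive worker: assumes 0 <= n <= 12, so the result fits in a
 * 32-bit int. */
static int factorial_worker(int n)
{
    if (n <= 1)   /* base case covers both 0! and 1! */
        return 1;
    return n * factorial_worker(n - 1);
}

/* Hypothetical gateway: validates n before recursing.
 * Returns 0 and stores n! in *out on success, -1 if n is out of range. */
int factorial_checked(int n, int *out)
{
    if (n < 0 || n > 12)
        return -1;
    *out = factorial_worker(n);
    return 0;
}
```

Keeping the range check in a non-recursive wrapper means it runs once, and the worker's base case of n <= 1 also stops the recursion for an input of 0.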


Section 63.21: Overstepping array boundaries 


Arrays are zero-based, that is the index always starts at 0 and ends with index array length minus 1. Thus the 
following code will not output the first element of the array and will output garbage for the final value that it prints. 


#include <stdio.h> 

int main(void) 
{ 
    int x = 0; 
    int myArray[5] = {1, 2, 3, 4, 5}; //Declaring 5 elements 

    for(x = 1; x <= 5; x++) //Looping from 1 till 5. 
        printf("%d\t", myArray[x]); 

    printf("\n"); 
    return 0; 
} 


Output: 2 3 4 5 GarbageValue 


The following demonstrates the correct way to achieve the desired output: 


#include <stdio.h> 

int main(void) 
{ 
    int x = 0; 
    int myArray[5] = {1, 2, 3, 4, 5}; //Declaring 5 elements 

    for(x = 0; x < 5; x++) //Looping from 0 till 4. 
        printf("%d\t", myArray[x]); 

    printf("\n"); 
    return 0; 
} 


Output: 1 2 3 4 5 


It is important to know the length of an array before working with it as otherwise you may corrupt the buffer or 
cause a segmentation fault by accessing memory locations that are out of bounds. 
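

One common way to avoid hard-coding the length is to derive it from the array itself and pass it along. This is a sketch; the macro ARRAY_LEN and the function sum_ints are hypothetical names, and the macro only works on a true array, not on a pointer.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Hypothetical helper: sums exactly count ints, indices 0 .. count - 1. */
int sum_ints(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

For the myArray above, ARRAY_LEN(myArray) is 5, and sum_ints(myArray, ARRAY_LEN(myArray)) visits indices 0 through 4 only.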


Section 63.22: Passing unadjacent arrays to functions 
expecting "real" multidimensional arrays 


When allocating multidimensional arrays with malloc, calloc, and realloc, a common pattern is to allocate the 
inner arrays with multiple calls (even if the call only appears once, it may be in a loop): 


/* Could also be `int **` with malloc used to allocate outer array. */ 
int *array[4]; 
int i; 

/* Allocate 4 arrays of 16 ints. */ 
for (i = 0; i < 4; i++) 
    array[i] = malloc(16 * sizeof(*array[i])); 


The gap in bytes between the last element of one of the inner arrays and the first element of the next inner 
array may not be 0, as it would be with a "real" multidimensional array (e.g. int array[4][16];): 


/* 0x40003c, 0x402000 */ 
printf("%p, %p\n", (void *)(array[0] + 15), (void *)array[1]); 


Taking into account the size of int, you get a difference of 8128 bytes (8132-4), which is 2032 int-sized array 
elements, and that is the problem: a "real" multidimensional array has no gaps between elements. 


If you need to use a dynamically allocated array with a function expecting a "real" multidimensional array, you 
should allocate an object of type int * and use arithmetic to perform calculations: 


void func(int M, int N, int *array); 


/* Equivalent to declaring `int array[M][N] = {{0}};` and assigning to array4_16[i][j]. */ 
int *array; 
int M = 4, N = 16; 
array = calloc(M, N * sizeof(*array)); 
array[i * N + j] = 1; 
func(M, N, array); 
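

The index arithmetic can be wrapped in small helpers so that callers do not repeat i * N + j by hand. This is only a sketch; grid_alloc and grid_at are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an m x n grid of ints as one zeroed,
 * contiguous block. Returns NULL on failure; release with free(). */
int *grid_alloc(size_t m, size_t n)
{
    return calloc(m, n * sizeof(int));
}

/* Hypothetical helper: address of element [i][j] in a grid that has
 * n columns. */
int *grid_at(int *grid, size_t n, size_t i, size_t j)
{
    return &grid[i * n + j];
}
```

Because the block is contiguous, the last element of one row and the first element of the next are exactly one int apart, as in a "real" multidimensional array.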


If N is a macro or an integer literal rather than a variable, the code can simply use the more natural 2-D array 
notation after allocating a pointer to an array: 


void func(int M, int N, int *array); 
#define N 16 
void func_N(int M, int (*array)[N]); 


int M = 4; 
int (*array)[N]; 
array = calloc(M, sizeof(*array)); 
array[i][j] = 1; 

/* Cast to `int *` works here because `array` is a single block of M*N ints with no gaps, 
   just like `int array2[M * N];` and `int array3[M][N];` would be. */ 
func(M, N, (int *)array); 
func_N(M, array); 


Version ≥ C99 


If N is not a macro or an integer literal, then array will point to a variable-length array (VLA). This can still be used 
with func by casting to int * and a new function func_vla would replace func_N: 


void func(int M, int N, int *array); 
void func_vla(int M, int N, int array[M][N]); 


int M = 4, N = 16; 
int (*array)[N]; 
array = calloc(M, sizeof(*array)); 
array[i][j] = 1; 

func(M, N, (int *)array); 
func_vla(M, N, array); 


Version ≥ C11 


Note: VLAs are optional as of C11. If your implementation supports C11 and defines the macro __STDC_NO_VLA__ to 
1, you are stuck with the pre-C99 methods. 